Abstract

This paper presents a new type analysis for logic programs. The analysis is performed 
with a priori type definitions; and type expressions are formed from a fixed alphabet of 
type constructors. Non-discriminative union is used to join type information from differ- 
ent sources without loss of precision. An operation that is performed repeatedly during 
an analysis is to detect if a fixpoint has been reached. This is reduced to checking the 
emptiness of types. Due to the use of non-discriminative union, the fundamental problem 
of checking the emptiness of types is more complex in the proposed type analysis than 
in other type analyses with a priori type definitions. The experimental results, however, 
show that use of tabling reduces the effect to a small fraction of analysis time on a set of 
benchmarks. 
1 Introduction

Types play an important role in programming. They make programs easier to un- 
derstand and help detect errors. There has been much research into types in logic 
programming. A type checker requires the programmer to declare types for each 
predicate in the program and verifies if the program is consistent with the declared 
types (Aiken and Lakshman 1994; Dart and Zobel 1992a; Fages and Coquery 2001; 
Frühwirth et al. 1991; Mycroft and O'Keefe 1984; Reddy 1990; Yardeni et al. 1991;
Yardeni and Shapiro 1991). A type analysis derives types for the predicates or 
literals in the program from the text of the program (Gallagher and Puebla 2002; 
Charatonik and Podelski 1998; Codish and Lagoon 2000; Gallagher and de Waal 
1994; Heintze and Jaffar 1990; Heintze and Jaffar 1992; Lu 1998; Mishra 1984;
Saglam and Gallagher 1995; Zobel 1987). 

This paper presents a new type analysis that infers types with a priori type defi- 
nitions which determine possible types and their meanings. Types are formed from 
type constructors from a fixed alphabet. This is in contrast to those type analy- 
ses that generate type definitions during analysis. Both kinds of type analysis are 
useful. An analysis that generates type definitions may be favored in compile-time 
optimizations and program transformations whilst an analysis with a priori type
definitions may be preferred in interactive programming tools such as debuggers 
because inferred types are easier for the programmer to understand. 

A number of factors compromise the precision of previous type analyses with 
a priori type definitions. Firstly, they only allow deterministic type definitions. A 
function symbol cannot occur more than once in the definition of the same type. 
A type then denotes a tree language recognized by a deterministic top-down tree 
automaton (Comon et al. 2002) and is hence called a deterministic type. The restriction
to deterministic type definitions allows fast propagation of type information.
However, it causes loss of precision because of the limited power of deterministic 
types. The same restriction also prevents many natural typings. For instance, the
two type rules float → +(integer, float) and float → +(float, float) violate the
restriction. Some previous work even disallows function overloading (Horiuchi and
Kanamori 1988; Kanamori and Horiuchi 1985; Kanamori and Kawamura 1993), 
which makes it hard to support built-in types. For instance, Prolog has the built-in
type atom that denotes the set of atoms. Without function overloading, an atom such
as [ ] cannot be a member of another type, say list. Secondly, the type languages
in previous type analyses with a priori type definitions do not include set union as 
a type constructor. The denotation of the join of two types can be larger than the 
set union of their denotations. For instance, the join of list(integer) and list(float) 
is list(number). Let or be a type constructor that is interpreted as set union. Then
list(number) is a super-type of or(list(integer), list(float)) since the list [1, 2.5] 
belongs to the former but not the latter. Should non-deterministic type definitions 
be allowed, there is also a need to use set intersection as a type constructor as ex- 
plained in Section 2. Finally, previous type analyses with a priori type definitions 
describe a set of substitutions by a single variable typing which maps variables of 
interest into types. The least upper bound of two variable typings is performed 
point-wise, effectively severing type dependency between variables. 

Our type analysis aims to improve precision by eliminating the above-mentioned
factors. It supports non-deterministic type definitions, uses a type language that 
includes set union and intersection as type constructors and describes a set of sub- 
stitutions by a set of variable typings. All these help improve analysis precision. On 
the other hand, they all incur a performance penalty. However, experimental results
with a prototype implementation show that tabling (Warren 1992) reduces the time 
increase to a small fraction on a suite of benchmark programs. Our type analysis is 
presented as an abstract domain together with a few primitive operations on the do- 
main. The domain is presented for an abstract semantics that is Nilsson's abstract 
semantics (Nilsson 1988) extended to deal with negation and built-in predicates. 
The primitive operations on the domain can be easily adapted to work with other 
abstract semantics such as (Bruynooghe 1991). 

The remainder of the paper is organized as follows. Section 2 provides motivation 
behind our work with some examples and Section 3 briefly presents the abstract 
semantics along with basic concepts and notations used in the remainder of the 
paper. Section 4 is devoted to types — their definitions and denotations. Section 5 
presents the abstract domain and Section 6 the abstract operations. In Section 7, 
we present a prototype implementation of our type analysis and some experimental
results. Section 8 compares our type analysis with others and Section 9 concludes. 
An appendix contains proofs. 

2 Motivation 

This section provides motivation behind our type analysis via examples. The pri- 
mary operations for propagating type information are informally illustrated; and 
the need for using set union and intersection as type constructors is highlighted. 

Example 2.1 

This example demonstrates the use of set union as a type constructor. Consider the 
following program and type rules. 

p(Z) ← ② X = a ③, Y = 2.5 ④, Z = cons(X, cons(Y, nil)) ⑤.
← ① p(Z) ⑥. % query

list(β) → nil
list(β) → cons(β, list(β))

The two type rules define lists. They state that a term is of type list(β) iff it is either
nil or of the form cons(X, Y) such that X is of type β and Y of type list(β). Type
rules are formally introduced in Section 4. The program has been annotated with 
circled numbers to identify relevant program points for the purpose of exposition. 

The type analysis can be thought of as an abstract execution that mimics the 
concrete (normal) execution of the program. A program state in the concrete execu- 
tion is replaced with an abstract one that describes the concrete state. The abstract 
states are type constraints. 

Suppose that no type information is given at program point ①, the start point
of the execution. This is described by the type constraint μ1 = true. The execution
reaches program point ② with the abstract state μ2 = true. The abstract state
at program point ③ is μ3 = (X ∈ atom), which states that X is of type atom.
The abstract state at program point ④ is μ4 = (X ∈ atom) ∧ (Y ∈ float). The
abstract execution of Z = cons(X, cons(Y, nil)) in μ4 obtains the abstract state
μ5 at program point ⑤. The computation of μ5 needs some explanation. The two
terms that are unified have the same type after the unification. Since μ4 does not
constrain Z, there is no type information propagated from Z to either X or Y. The
type for Z in μ5 equals the type of cons(X, cons(Y, nil)) in μ4, which is computed in
a bottom-up manner. To compute the type for nil, we apply the type rule for nil/0.
The type rule states that nil is of type list(β) for any β. Thus, the most precise
type for nil is list(0) where the type 0 denotes the empty set of terms. We omit the
process of computing the type list(float) for cons(Y, nil) in μ4 since it is similar to
the following. To compute the type for cons(X, cons(Y, nil)), we apply the type rule
for cons/2. The right-hand side of the type rule is cons(β, list(β)). We first find the
smallest value for β such that β is greater than or equal to atom (the type for X in
μ4) and the smallest value for β such that list(β) is greater than or equal to list(float)
(the type for cons(Y, nil) in μ4). Those two values are respectively atom and float
and their least upper bound is or(atom, float). Replacing β with or(atom, float) in
the left-hand side of the type rule gives the most precise type list(or(atom, float))
for cons(X, cons(Y, nil)) in μ4. Conjoining Z ∈ list(or(atom, float)) with μ4 results
in μ5 = ((X ∈ atom) ∧ (Y ∈ float) ∧ (Z ∈ list(or(atom, float)))). The abstract state
at program point ⑥ is μ6 = (Z ∈ list(or(atom, float))), which is obtained from μ5
by projecting out type constraints on X and Y.

The existence of the type constructor or helps avoid approximations. Without it, 
the least upper bound of atom and float is 1 which denotes the set of all terms. 
Note that the collection of type rules is fixed during analysis. ■
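The bottom-up computation in Example 2.1 can be sketched in Python. This is only an illustration under assumptions made here: types are encoded as strings and nested tuples, only the two list rules are hard-coded, and the helper names (join, match_list, type_of_cons) are hypothetical rather than operations defined by the analysis.

```python
# Assumed encoding: "0" (empty type), "1" (all terms), base names such as "atom",
# and tuples such as ("list", T) or ("or", T1, T2).

def join(t1, t2):
    """Least upper bound expressed with the `or` constructor, lightly simplified."""
    if t1 == t2 or t2 == "0":
        return t1
    if t1 == "0":
        return t2
    if "1" in (t1, t2):
        return "1"
    return ("or", t1, t2)

def match_list(t):
    """Smallest beta such that list(beta) is greater than or equal to t."""
    if isinstance(t, tuple) and t[0] == "list":
        return t[1]
    return "1"  # unsuccessful match: upper approximation

def type_of_nil():
    """Most precise type obtained from the rule list(beta) -> nil."""
    return ("list", "0")

def type_of_cons(head_type, tail_type):
    """Apply the rule list(beta) -> cons(beta, list(beta)) bottom-up."""
    beta = join(head_type, match_list(tail_type))
    return ("list", beta)

tail = type_of_cons("float", type_of_nil())   # type of cons(Y, nil)
whole = type_of_cons("atom", tail)            # type of cons(X, cons(Y, nil))
print(tail)    # ('list', 'float')
print(whole)   # ('list', ('or', 'atom', 'float'))
```

Without the `or` branch in join, the only sound result for atom and float would be "1", which is the loss of precision the example describes.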

When two or more type rules are associated with a single function symbol, there 
is also a need to use set intersection as a type constructor. The following example 
illustrates this point. 

Example 2.2 

Suppose that types are defined by the following four type rules. 
list(β) → nil
list(β) → cons(β, list(β))
tree(β) → nil
tree(β) → node(tree(β), β, tree(β))

Consider the problem of computing the type for cons(X, nil) in the abstract state
μ = (X ∈ integer).

There are two type rules for nil/0. The type rule list(β) → nil states that nil
belongs to list(β) for any β. The most precise type for nil that can be inferred
from this rule is list(0). Similarly, the most precise type for nil that can be inferred
from the type rule tree(β) → nil is tree(0). Thus, the most precise type for nil is
and(list(0), tree(0)) where and is a type constructor that denotes set intersection.

To compute the type for cons(X, nil), we apply the type rule for cons/2. Its
right-hand side is cons(β, list(β)). We first find the smallest value for β such that
β is greater than or equal to integer (the type for X in μ). The value is integer.
We then find the smallest value for β such that list(β) is greater than or equal
to and(list(0), tree(0)) (the type for nil in μ). This is done by matching list(β)
with list(0) and with tree(0) and intersecting the values for β obtained from these
two matches. The first match results in 0. The second match is unsuccessful and
produces 1 since we are computing an upper approximation. The intersection of
these two types is and(0, 1), which is equivalent to 0. The join of the two smallest
values integer and 0 for β is or(integer, 0), which is equivalent to integer. Finally,
the type list(integer) for cons(X, nil) is obtained by substituting integer for β in
the left-hand side of the type rule.

Without and in the type language, a choice must be made between list(0) and
tree(0) as the type for nil. Though these types are equivalent to and(list(0), tree(0)),
the choice made could complicate the ensuing computation. Should tree(0) be chosen,
we would need to find the smallest value for β such that list(β) is greater than
or equal to tree(0). This could only be solved by applying an algorithm for solving
type inclusion constraints. The presence of and allows us to avoid that. ■
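The matching step of Example 2.2 can be sketched as follows. The tuple encoding of types and the helper names (meet, join, match) are assumptions made for this illustration; the sketch handles only unary constructors such as list and tree and is not the abstract operation defined later in the paper.

```python
# Assumed encoding: "0", "1", base names such as "integer", and tuples such as
# ("list", T), ("tree", T), ("and", T1, T2), ("or", T1, T2).

def meet(t1, t2):
    """Intersection expressed with `and`, simplified with 0 and 1."""
    if t1 == t2 or t2 == "1":
        return t1
    if t1 == "1":
        return t2
    if "0" in (t1, t2):
        return "0"
    return ("and", t1, t2)

def join(t1, t2):
    """Union expressed with `or`, simplified with 0 and 1."""
    if t1 == t2 or t2 == "0":
        return t1
    if t1 == "0":
        return t2
    if "1" in (t1, t2):
        return "1"
    return ("or", t1, t2)

def match(ctor, t):
    """Upper approximation of the smallest beta with ctor(beta) >= t."""
    if t == "0":
        return "0"
    if isinstance(t, tuple) and t[0] == "and":
        return meet(match(ctor, t[1]), match(ctor, t[2]))
    if isinstance(t, tuple) and t[0] == "or":
        return join(match(ctor, t[1]), match(ctor, t[2]))
    if isinstance(t, tuple) and t[0] == ctor:
        return t[1]
    return "1"  # unsuccessful match

nil_type = ("and", ("list", "0"), ("tree", "0"))
beta = join("integer", match("list", nil_type))   # or(integer, and(0, 1))
cons_type = ("list", beta)
print(cons_type)   # ('list', 'integer')
```

Because the and type is matched component by component, no type inclusion constraint needs to be solved for the unsuccessful match against tree(0).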

For the purpose of improving the precision of analysis, there is also a need for 
disjunction at the level of abstract states. The following example illustrates this 
point. 

Example 2.3 

Consider the following program 

p(X) ← q(X, Y), ① ...
q(1, 2).
q(a, b).

← p(X). % query

When the execution reaches program point ①, X and Y are both of type integer or
they are both of type atom. This is described by a type constraint ((X ∈ integer) ∧
(Y ∈ integer)) ∨ ((X ∈ atom) ∧ (Y ∈ atom)). Without disjunction at the level of
abstract states, we would have to replace the type constraint with a less precise
one: ((X ∈ or(integer, atom)) ∧ (Y ∈ or(integer, atom))). ■
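The loss of type dependency in Example 2.3 can be made concrete with a small Python sketch. The dictionary encoding of variable typings and the function names are assumptions made for this illustration only.

```python
# An abstract state kept as a set (here a list) of variable typings.
state = [
    {"X": "integer", "Y": "integer"},
    {"X": "atom", "Y": "atom"},
]

def pointwise_join(typings):
    """Collapse the typings into one; each variable gets the union of its types."""
    joined = {}
    for typing in typings:
        for var, ty in typing.items():
            joined.setdefault(var, set()).add(ty)
    return joined

def admitted_by_set(typings, candidate):
    """True if some single typing in the set agrees with the candidate."""
    return any(candidate == typing for typing in typings)

def admitted_by_join(joined, candidate):
    """True if every variable's candidate type lies in its joined type."""
    return all(candidate[var] in joined[var] for var in candidate)

mixed = {"X": "integer", "Y": "atom"}    # never produced by the program
joined = pointwise_join(state)
print(admitted_by_set(state, mixed))     # False
print(admitted_by_join(joined, mixed))   # True
```

The mixed combination is rejected by the set of typings but accepted after the point-wise join, which is the imprecision that disjunction at the level of abstract states avoids.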

3 Preliminaries 

The reader is assumed to be familiar with the terminology of logic programming 
(Lloyd 1987) and that of abstract interpretation (Cousot and Cousot 1977). We 
consider a subset of Prolog which contains definite logic programs extended with 
negation as failure and some built-in predicates. 

3.1 Basic Concepts 

We sometimes use Church's lambda notation for functions, so that a function f
defined by f(x) = e will be denoted λx.e. Let A and B be sets. Then A ↦ B is
the set of total functions from A to B and A ↣ B is the set of partial functions
from A to B. The function composition ∘ is defined by f ∘ g = λx.f(g(x)). Let D
be a set. A sequence over D is either ε or d • d̄ where d ∈ D and d̄ is a sequence
over D. The infix operator • associates to the right and prepends an element to
a sequence to form a longer sequence. The set of all sequences over D is denoted
D*. Let d̄ = d1 • d2 • ⋯ • dn • ε. We will sometimes write d̄ as d1, d2, ⋯, dn. The
dimension ‖d̄‖ of d̄ is n. Let E ⊆ D and S ⊆ D*. The set extension of • is defined
as E • S = {d • d̄ | d ∈ E ∧ d̄ ∈ S}.

3.2 Abstract Interpretation 

A semantics of a program is given by an interpretation ⟨(C, ⊑_C), 𝒞⟩ where (C, ⊑_C)
is a complete lattice and 𝒞 is a monotone function on (C, ⊑_C). The semantics is
defined as the least fixed point lfp 𝒞 of 𝒞. The concrete semantics of the program
is given by the concrete interpretation ⟨(C, ⊑_C), 𝒞⟩ while an abstract semantics
is given by an abstract interpretation ⟨(A, ⊑_A), 𝒜⟩. The correspondence between
the concrete and the abstract domains is formalized by a Galois connection (α, γ)
between (C, ⊑_C) and (A, ⊑_A). A Galois connection between A and C is a pair of
monotone functions α : C ↦ A and γ : A ↦ C satisfying ∀c ∈ C.(c ⊑_C γ ∘ α(c))
and ∀a ∈ A.(α ∘ γ(a) ⊑_A a). The function α is called an abstraction function
and the function γ a concretization function. A sufficient condition for lfp 𝒜 to be
a safe abstraction of lfp 𝒞 is ∀a ∈ A.(α ∘ 𝒞 ∘ γ(a) ⊑_A 𝒜(a)) or equivalently
∀a ∈ A.(𝒞 ∘ γ(a) ⊑_C γ ∘ 𝒜(a)), according to propositions 24 and 25 in (Cousot and
Cousot 1992). The abstraction and concretization functions in a Galois connection
uniquely determine each other; and a complete meet-morphism γ : A ↦ C induces
a Galois connection (α, γ) with α(c) = ⊓_A{a | c ⊑_C γ(a)}. A function γ : A ↦ C
is a complete meet-morphism iff γ(⊓_A X) = ⊓_C{γ(x) | x ∈ X} for any X ⊆ A. Thus,
an analysis can be formalized as a tuple (⟨(C, ⊑_C), 𝒞⟩, γ, ⟨(A, ⊑_A), 𝒜⟩) such that
⟨(C, ⊑_C), 𝒞⟩ and ⟨(A, ⊑_A), 𝒜⟩ are interpretations, γ is a complete meet-morphism
from (A, ⊑_A) to (C, ⊑_C), and ∀a ∈ A.(𝒞 ∘ γ(a) ⊑_C γ ∘ 𝒜(a)).
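The two Galois connection conditions can be checked exhaustively on a toy domain. The sign abstraction over a five-element universe used below is an assumption chosen for illustration and is unrelated to the type domain of this paper; the abstraction function is derived from the concretization as the least abstract element whose concretization covers the given set.

```python
from itertools import combinations

UNIVERSE = frozenset(range(-2, 3))

# Concretization of each abstract element; its image is closed under intersection.
GAMMA = {
    "bot": frozenset(),
    "neg": frozenset(x for x in UNIVERSE if x < 0),
    "zero": frozenset({0}),
    "pos": frozenset(x for x in UNIVERSE if x > 0),
    "top": UNIVERSE,
}

def leq(a, b):
    """Abstract order, induced here by inclusion of concretizations."""
    return GAMMA[a] <= GAMMA[b]

def alpha(c):
    """Least abstract element a such that c is included in gamma(a)."""
    candidates = [a for a in GAMMA if c <= GAMMA[a]]
    least = [a for a in candidates if all(leq(a, b) for b in candidates)]
    return least[0]

def subsets(s):
    """Enumerate all subsets of a finite set."""
    items = sorted(s)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)

# c is below gamma(alpha(c)) for every concrete element c.
extensive = all(c <= GAMMA[alpha(c)] for c in subsets(UNIVERSE))
# alpha(gamma(a)) is below a for every abstract element a.
reductive = all(leq(alpha(GAMMA[a]), a) for a in GAMMA)
print(extensive, reductive)   # True True
```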



3.3 Logic Programs

Let Σ be a set of function symbols, Π a set of predicate symbols and Var a denumerable
set of variables. Each function or predicate symbol has an arity which is a non-negative
integer. We write f/n ∈ Σ for an n-ary function symbol f in Σ and p/n ∈ Π
for an n-ary predicate symbol p in Π. Let V ⊆ Var. The set of all terms over Σ and
V, denoted Term(Σ, V), is the smallest set satisfying: (i) V ⊆ Term(Σ, V); and (ii) if
{t1, ⋯, tn} ⊆ Term(Σ, V) and f/n ∈ Σ then f(t1, ⋯, tn) ∈ Term(Σ, V). The set of
all atoms that can be constructed from Π and Term(Σ, V) is denoted Atom(Π, Σ, V);
Atom(Π, Σ, V) = {p(t1, ⋯, tn) | (p/n ∈ Π) ∧ ({t1, ⋯, tn} ⊆ Term(Σ, V))}. Let
Term = Term(Σ, Var) and Atom = Atom(Π, Σ, Var) for abbreviation. The set Term
contains all terms and the set Atom all atoms. The negation of an atom p(t1, ⋯, tn)
is written ¬p(t1, ⋯, tn). A literal is either an atom or the negation of an atom. The
set of all literals is denoted Literal. Let Bip denote the set of calls to built-in predicates.
Note that Bip ⊆ Atom.

A clause C is a formula of the form H ← L1, ⋯, Ln ■ where H ∈ Atom ∪ {□} and
Li ∈ Literal for 1 ≤ i ≤ n. H is called the head of the clause and L1, ⋯, Ln ■ the
body of the clause. Note that □ denotes the empty head and ■ denotes the empty
body. A query is a clause whose head is □. A program is a set of clauses of which
one is a query. The query initiates the execution of the program.

Program states which exist during the execution of a logic program are called
substitutions. A substitution θ is a mapping from Var to Term such that dom(θ) =
{x | (x ∈ Var) ∧ (θ(x) ≠ x)} is finite. The set dom(θ) is called the domain of θ.
Let dom(θ) = {x1, ⋯, xn}. Then θ is written as {x1 ↦ θ(x1), ⋯, xn ↦ θ(xn)}.
A substitution θ is idempotent if θ ∘ θ = θ. The set of idempotent substitutions
is denoted Sub; and the identity substitution is denoted ε. Let Sub_fail = Sub ∪
{fail} and extend ∘ by θ ∘ fail = fail and fail ∘ θ = fail for any θ ∈ Sub_fail.



Improving Precision of Type Analysis Using Non-Discriminative Union 7 

Substitutions are not distinguished from their homomorphic extensions to various 
syntactic categories. 
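Substitution application, composition and the idempotence test can be sketched in Python. The term encoding is an assumption made here: variables are strings beginning with an upper-case letter, constants are lower-case strings, and compound terms are tuples whose first component is the function symbol.

```python
def is_var(t):
    """Variables are strings that start with an upper-case letter (assumed encoding)."""
    return isinstance(t, str) and t[:1].isupper()

def apply(theta, t):
    """Homomorphic extension of the substitution theta to terms."""
    if is_var(t):
        return theta.get(t, t)
    if isinstance(t, tuple):
        return (t[0],) + tuple(apply(theta, a) for a in t[1:])
    return t  # constant

def compose(theta, sigma):
    """theta ∘ sigma: apply sigma first, then theta."""
    result = {x: apply(theta, t) for x, t in sigma.items()}
    for x, t in theta.items():
        result.setdefault(x, t)
    # Drop identity bindings so that the dictionary represents dom exactly.
    return {x: t for x, t in result.items() if t != x}

def is_idempotent(theta):
    return compose(theta, theta) == theta

print(is_idempotent({"X": ("f", "Y")}))        # True
print(is_idempotent({"X": "Y", "Y": "a"}))     # False
print(compose({"Y": "a"}, {"X": "Y"}))         # {'X': 'a', 'Y': 'a'}
```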

An equation is a formula of the form l = r where either l, r ∈ Term or l, r ∈ Atom.
The set of all equations is denoted Eqn. For a set of equations E, mgu : ℘(Eqn) ↦
Sub_fail returns either a most general unifier for E if E is unifiable or fail otherwise.
Let mgu(l, r) stand for mgu({l = r}). Define eq(θ) = {x = θ(x) | x ∈ dom(θ)} for
θ ∈ Sub and eq(fail) = fail.
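A textbook unification procedure with occurs check gives one possible realization of mgu. The sketch below is not taken from the paper: the term encoding (upper-case strings for variables, lower-case strings for constants, tuples for compound terms) and the use of None for fail are assumptions made for illustration.

```python
def is_var(t):
    return isinstance(t, str) and t[:1].isupper()

def apply(theta, t):
    """Apply the substitution theta to the term t."""
    if is_var(t):
        return theta.get(t, t)
    if isinstance(t, tuple):
        return (t[0],) + tuple(apply(theta, a) for a in t[1:])
    return t

def occurs(x, t):
    """True if variable x occurs in term t."""
    if t == x:
        return True
    return isinstance(t, tuple) and any(occurs(x, a) for a in t[1:])

def mgu(equations):
    """Return an idempotent most general unifier as a dict, or None for fail."""
    theta = {}
    work = list(equations)
    while work:
        l, r = work.pop()
        l, r = apply(theta, l), apply(theta, r)
        if l == r:
            continue
        if is_var(r) and not is_var(l):
            l, r = r, l
        if is_var(l):
            if occurs(l, r):
                return None
            # Substitute the new binding into existing ones to keep theta idempotent.
            theta = {x: apply({l: r}, t) for x, t in theta.items()}
            theta[l] = r
        elif (isinstance(l, tuple) and isinstance(r, tuple)
              and l[0] == r[0] and len(l) == len(r)):
            work.extend(zip(l[1:], r[1:]))
        else:
            return None  # clash of function symbols or arities
    return theta

print(mgu([(("p", "X", "b"), ("p", "a", "Y"))]))   # {'Y': 'b', 'X': 'a'}
print(mgu([("X", ("f", "X"))]))                    # None
```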

The set of variables in a syntactic object o is denoted vars(o). A renaming substitution
ρ is a substitution such that {ρ(x) | x ∈ Var} is a permutation of Var. The
set of all renaming substitutions is denoted Ren. Define Ren(o1, o2) = {ρ ∈ Ren |
vars(ρ(o1)) ∩ vars(o2) = ∅}.

We assume that there is a function sys : Bip × Sub ↦ ℘(Sub) that models the
behavior of built-in predicates. The set sys(p(t1, ⋯, tn), θ) consists of all those
substitutions σ ∘ θ such that σ is a computed answer to θ(p(t1, ⋯, tn)).

Let V_P be the set of variables in the program and Atom_P = Atom(Π, Σ, V_P).
Define uf : Atom_P × Sub × Atom_P × Sub ↦ Sub_fail by

uf(a1, θ, a2, ω) = let ρ ∈ Ren(θ(a1), ω(a2)) in mgu(ρ(θ(a1)), ω(a2)) ∘ ω

The operation uf(a1, θ, a2, ω) models both procedure-call and procedure-exit operations.
In a procedure-call operation, a1 and θ are the call and the program state
before the call, a2 is the head of the clause that is used to resolve with the call and
ω the identity substitution ε. In a procedure-exit operation, a2 and ω are the call
and the program state before the call, a1 is the head of the clause that was used to
resolve with the call and θ is the program state after the execution of the body of
the clause. A renaming is applied to the call in a procedure-call operation whilst in
a procedure-exit operation it is the head of the clause that is renamed.

3.4 Abstract Semantics 

The new type analysis is presented as an abstract domain with four abstract op- 
erations. The domain and the operations are designed for an abstract semantics 
in (Nilsson 1988) extended with support for negation-as-failure and built-in pred-
icates. The extended abstract semantics is a special case of an abstract semantics 
in (Lu 2003) where a formal presentation can be found. The adaptation of the anal- 
ysis to other abstract semantics such as (Bruynooghe 1991) is straightforward since 
they require abstract operations with similar functionalities. 

The abstract semantics is parameterized by an abstract domain ⟨ASub♭, ⊑♭⟩. The
elements in ASub♭ are called abstract substitutions since they are properties of substitutions.
The abstract domain is related to the collecting domain ⟨℘(Sub), ⊆⟩ via
a concretization function γ : ASub♭ ↦ ℘(Sub). We say that an abstract substitution
π describes a set Θ of substitutions iff Θ ⊆ γ(π). As usual, the abstract domain
and the concretization function are required to satisfy the following conditions.

C1: ⟨ASub♭, ⊑♭⟩ is a complete lattice with least upper bound operation ⊔♭;
C2: γ(ASub♭) is a Moore family where γ(X) = {γ(x) | x ∈ X}.
We informally present the abstract semantics using the following program as a
running example.

diff(X, L, K) ← ① member(X, L), ② ¬member(X, K) ③
diff(X, L, K) ← ④ member(X, K), ⑤ ¬member(X, L) ⑥
member(X, [X|L]) ← ⑦
member(X, [H|L]) ← ⑧ member(X, L) ⑨
← ⑩ Y = [a, b] ⑪ Z = [1, 2] ⑫ diff(X, Y, Z) ⑬

The intended interpretation for member(X, L) is that X is a member of list L.
The intended interpretation for diff(X, L, K) is that X is in L or K but not
in both. For brevity of exposition, let A = member(X, L); B = member(X, K);
C = member(X, [X|L]); D = member(X, [H|L]); E = diff(X, L, K) and F =
diff(X, Y, Z). The atom in the literal to the right of a program point p is denoted
𝒜(p). For instance, 𝒜(2) = 𝒜(4) = B. Let ℋ(p) denote the head of the clause
with which p is associated. For instance, ℋ(1) = ℋ(2) = E. Let p⁻ be the point to
the left of p if p⁻ exists. For instance, 2⁻ = 1 whilst 1⁻ is undefined.

The abstract semantics associates each textual program point with an abstract
substitution. The abstract substitution describes all the substitutions that may be
obtained when the execution reaches the program point. The abstract semantics is
the least solution to a system of data flow equations, one for each program point.
The system is derived from the control flow graph of the program whose vertices
are the textual program points. Let Pt be the set of the textual program points.
An edge from vertex p to vertex q in the graph is denoted q ← p; and it indicates
that the execution may reach q immediately after it reaches p.

Consider the example program. We have Pt = {1, ⋯, 13}. The program point
ι = 10 is called the initial program point since it is where the execution of the
program is initiated. The abstract substitution at ι = 10 is an analysis input,
denoted π_ι, and it does not change during analysis. Thus, the data flow equation
for program point 10 is X♭(10) = π_ι where X♭ is a mapping from program points to
abstract substitutions. The data flow equations for other program points are derived
by considering four kinds of control flow that may arise during program execution.
The first kind models the execution of built-in calls. For instance, the control may
flow from program point 10 to program point 11 by executing Y = [a, b]. The data
flow equation for program point 11 is X♭(11) = Sys♭(Y = [a, b], X♭(10)) where
the transfer function Sys♭ : Bip × ASub♭ ↦ ASub♭ emulates the execution of a
built-in call. Let Pt^bip be the set of all the program points that follow the built-in
calls in the program. We have Pt^bip = {11, 12} for the example program. Another
kind of control flow models negation-as-failure. The transfer function for this kind
of control flow is the identity function. For instance, the control may flow from
program point 2 to program point 3 since member(X, K) may fail, which yields the
data flow equation X♭(3) = X♭(2). Denote by Pt^nf the set of all the program points
that follow negative literals. We have Pt^nf = {3, 6} for the example program.

The third kind of control flow arises when a procedure-call is performed. For
instance, the control may flow from program point 1 to program point 8. The
description of data that flow from program point 1 to program point 8 is expressed
as Uf♭(A, X♭(1), D, Id♭) where Id♭ is an abstract substitution that describes {ε}.
Note that A is the call and D the head of the clause to which program point 8
belongs. The control may also flow to program point 8 from program points 4, 8, 2
and 5. The control flows from program point 5 to program point 8 when the negated
sub-goal member(X, L) is executed. The descriptions of data that flow to program
point 8 from those five source program points are merged together using the least
upper bound operation ⊔♭ on ASub♭, yielding the following data flow equation.

X♭(8) = Uf♭(A, X♭(1), D, Id♭) ⊔♭ Uf♭(B, X♭(4), D, Id♭) ⊔♭ Uf♭(A, X♭(8), D, Id♭)
        ⊔♭ Uf♭(B, X♭(2), D, Id♭) ⊔♭ Uf♭(A, X♭(5), D, Id♭)

The transfer function Uf♭ : Atom_P × ASub♭ × Atom_P × ASub♭ ↦ ASub♭ approximates
Uf : Atom_P × ℘(Sub) × Atom_P × ℘(Sub) ↦ ℘(Sub) defined

Uf(a1, Θ1, a2, Θ2) = {uf(a1, θ1, a2, θ2) ≠ fail | θ1 ∈ Θ1 ∧ θ2 ∈ Θ2}

which is the set extension of uf. Denote by Pt^call the set of program points that are
reached via procedure-calls. We have Pt^call = {1, 4, 7, 8} for the example program.

The fourth kind of control flow arises when a procedure exits. For instance,
the control may flow from program point 3 to program point 13. The description
of data that flow from program point 3 to program point 13 is expressed by
Uf♭(E, X♭(3), F, X♭(12)) where E is the head of the clause to which program point
3 belongs and F the call that invoked the clause. The only other control flow to
program point 13 is from program point 6. Thus, the data flow equation for program
point 13 is X♭(13) = Uf♭(E, X♭(3), F, X♭(12)) ⊔♭ Uf♭(E, X♭(6), F, X♭(12)).
Let Pt^ret be the set of program points that are reached via procedure-exits. For the
example program, we have Pt^ret = {2, 5, 9, 13}.

Let Edge^j = {q ← p | q ∈ Pt^j} where j ∈ {call, ret, nf, bip}. Note that Edge^j is
the set of control flows that sink in Pt^j. The data flow equation has the following
general form.

\[
X^\flat(q) =
\begin{cases}
\pi_\iota & \text{if } q = \iota \\
\bigsqcup^\flat \{\, Uf^\flat(\mathcal{A}(p), X^\flat(p), \mathcal{H}(q), Id^\flat) \mid q \leftarrow p \in Edge^{call} \,\} & \text{if } q \in Pt^{call} \\
\bigsqcup^\flat \{\, Uf^\flat(\mathcal{H}(p), X^\flat(p), \mathcal{A}(q^-), X^\flat(q^-)) \mid q \leftarrow p \in Edge^{ret} \,\} & \text{if } q \in Pt^{ret} \\
X^\flat(q^-) & \text{if } q \in Pt^{nf} \\
Sys^\flat(\mathcal{A}(q^-), X^\flat(q^-)) & \text{if } q \in Pt^{bip}
\end{cases}
\]

where π_ι is the input abstract substitution. The least solution to the system of data
flow equations is a correct analysis if, in addition to C1 and C2, the following local
safety requirements are met.

C3: {ε} ⊆ γ(Id♭);

C4: Sys(a, γ(π)) ⊆ γ(Sys♭(a, π)) for any a ∈ Bip with vars(a) ⊆ V_P and π ∈
ASub♭; and

C5: Uf(a1, γ(π1), a2, γ(π2)) ⊆ γ ∘ Uf♭(a1, π1, a2, π2) for any π1, π2 ∈ ASub♭ and any
a1, a2 ∈ Atom_P.

Note that the condition C2 implies that ⊔♭ safely abstracts ∪ with respect to γ. The
operation Uf♭ is called abstract unification since it mimics the normal unification
operation whilst Sys♭ is called the abstract built-in execution operation.

The complete system of data flow equations for the example program is as follows. 



X♭(1)  = Uf♭(F, X♭(12), E, Id♭)
X♭(2)  = Uf♭(C, X♭(7), A, X♭(1)) ⊔♭ Uf♭(D, X♭(9), A, X♭(1))
X♭(3)  = X♭(2)
X♭(4)  = Uf♭(F, X♭(12), E, Id♭)
X♭(5)  = Uf♭(C, X♭(7), B, X♭(4)) ⊔♭ Uf♭(D, X♭(9), B, X♭(4))
X♭(6)  = X♭(5)
X♭(7)  = Uf♭(A, X♭(1), C, Id♭) ⊔♭ Uf♭(B, X♭(4), C, Id♭) ⊔♭
         Uf♭(A, X♭(8), C, Id♭) ⊔♭ Uf♭(B, X♭(2), C, Id♭) ⊔♭
         Uf♭(A, X♭(5), C, Id♭)
X♭(8)  = Uf♭(A, X♭(1), D, Id♭) ⊔♭ Uf♭(B, X♭(4), D, Id♭) ⊔♭
         Uf♭(A, X♭(8), D, Id♭) ⊔♭ Uf♭(B, X♭(2), D, Id♭) ⊔♭
         Uf♭(A, X♭(5), D, Id♭)
X♭(9)  = Uf♭(C, X♭(7), A, X♭(8)) ⊔♭ Uf♭(D, X♭(9), A, X♭(8))
X♭(10) = π_ι
X♭(11) = Sys♭(Y = [a, b], X♭(10))
X♭(12) = Sys♭(Z = [1, 2], X♭(11))
X♭(13) = Uf♭(E, X♭(3), F, X♭(12)) ⊔♭ Uf♭(E, X♭(6), F, X♭(12))
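The least solution of such a system can be computed by iterating from the bottom element until no equation changes its value. The following Python sketch shows a plain round-robin iteration on a toy system; the four equations, the point names and the use of finite sets with union as the join are assumptions made for illustration and do not reproduce the abstract domain of this paper or the iteration strategy of the prototype.

```python
def least_solution(equations, bottom):
    """Round-robin Kleene iteration: start at bottom and re-evaluate until stable.

    `equations` maps each program point to a monotone function of the current
    assignment; termination relies on the lattice having finite height.
    """
    values = {point: bottom for point in equations}
    changed = True
    while changed:
        changed = False
        for point, rhs in equations.items():
            new_value = rhs(values)
            if new_value != values[point]:
                values[point] = new_value
                changed = True
    return values

# Toy system with a cyclic dependency between points "b" and "c".
equations = {
    "a": lambda x: frozenset({"integer"}),        # analysis input
    "b": lambda x: x["a"] | x["c"],               # join of two incoming flows
    "c": lambda x: x["b"],                        # identity transfer function
    "d": lambda x: x["b"] | frozenset({"atom"}),  # transfer that adds information
}

solution = least_solution(equations, frozenset())
print(sorted(solution["b"]))   # ['integer']
print(sorted(solution["d"]))   # ['atom', 'integer']
```

A worklist that re-evaluates only the equations depending on a changed point would avoid redundant evaluations; the round-robin form is shown because it is the shortest correct variant.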



The remainder of the paper presents our type analysis as an abstract domain and 
four abstract operations as required by the above abstract semantics. We begin with 
the type language and type definitions. 



4 Types 

The type language in a type system decides which sets of terms are types. A
type is syntactically a ground term constructed from a ranked alphabet Cons and
{and, or, 1, 0} where and and or are binary and 1 and 0 are nullary. Elements
of Cons ∪ {and, or, 1, 0} are called type constructors. It is assumed that (Cons ∪
{and, or, 1, 0}) ∩ Σ = ∅. The set of types is Type = Term(Cons ∪ {and, or, 1, 0}, ∅).
The denotations of type constructors in Cons are determined by type definitions
whilst and, or, 1 and 0 have fixed denotations.



4.1 Type Rules

Types are defined by type rules. A type parameter is a variable from Para. A type
scheme is either a type parameter or of the form c(β1, ⋯, βm) where c ∈ Cons and
β1, ⋯, βm are different parameters. Let Schm be the set of all type schemes. A type
rule is of the form c(β1, ⋯, βm) → f(τ1, ⋯, τn) where c ∈ Cons, f/n ∈ Σ, β1, ⋯, βm
are different type parameters, and τj is a type scheme with type parameters from
{β1, ⋯, βm}. Note that every type parameter in the right-hand side of a type rule
must occur in the left-hand side. Overloading of function symbols is permitted since
a function symbol can appear in the right-hand sides of two or more type rules.
Let Δ be the set of all type rules. We assume that each function symbol occurs in
at least one type rule and that each type constructor occurs in at least one type
rule. Type rules are similar to type definitions used in typed logic programming
languages such as Mercury (Somogyi et al. 1996) and Gödel (Hill and Lloyd 1994).

Example 4.1

Let Σ = {0, s(), [ ], [ | ], void, tr( , , )} and Cons = {nat, even, odd, list(), tree()}. The
following set of type rules will be used in examples throughout the paper.

nat → 0,            nat → s(nat),
even → 0,           even → s(odd),
odd → s(even),
list(β) → [ ],      list(β) → [β | list(β)],
tree(β) → void,     tree(β) → tr(β, tree(β), tree(β))

Type rules in Δ define natural numbers, even numbers, odd numbers, lists and
trees. ■


4.2 Denotations of Types

A (ground) type substitution is a member of TSub = (Para ↣ Type) ∪ {⊤, ⊥}. The
application of a type substitution to a type scheme is defined as follows. ⊤(τ) = 1
and ⊥(τ) = 0 for any type scheme τ. Let k ∈ (Para ↣ Type). Define k(β) = 0 for
each β ∉ dom(k) where dom(k) is the domain of k. Then k(τ) is obtained by replacing
each β in τ with k(β). For instance, {β1 ↦ list(nat), β2 ↦ nat}(list(β1)) =
list(list(nat)).
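Applying a type substitution to a type scheme can be sketched as follows. The encoding is an assumption made for illustration: type parameters are strings beginning with "beta", nullary constructors are other strings, compound schemes are tuples, and the two special substitutions are represented by the hypothetical markers TOP and BOTTOM.

```python
TOP, BOTTOM = "TOP", "BOTTOM"   # stand-ins for the two special type substitutions

def is_param(s):
    """Type parameters are strings starting with 'beta' (assumed encoding)."""
    return isinstance(s, str) and s.startswith("beta")

def apply_tsub(k, scheme):
    """Apply the type substitution k to a type scheme."""
    if k == TOP:
        return "1"
    if k == BOTTOM:
        return "0"
    if is_param(scheme):
        return k.get(scheme, "0")   # parameters outside dom(k) are mapped to 0
    if isinstance(scheme, tuple):
        return (scheme[0],) + tuple(apply_tsub(k, a) for a in scheme[1:])
    return scheme                   # nullary constructor such as "nat"

k = {"beta1": ("list", "nat"), "beta2": "nat"}
print(apply_tsub(k, ("list", "beta1")))     # ('list', ('list', 'nat'))
print(apply_tsub({}, ("list", "beta1")))    # ('list', '0')
print(apply_tsub(TOP, ("list", "beta1")))   # 1
```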

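Definition 4.2

The meaning of a type is defined by a function ⟦·⟧_Δ : Type ↦ ℘(Term).

\[
\begin{aligned}
[\![1]\!]_\Delta &= \mathit{Term} \\
[\![0]\!]_\Delta &= \emptyset \\
[\![and(R_1, R_2)]\!]_\Delta &= [\![R_1]\!]_\Delta \cap [\![R_2]\!]_\Delta \\
[\![or(R_1, R_2)]\!]_\Delta &= [\![R_1]\!]_\Delta \cup [\![R_2]\!]_\Delta \\
[\![c(R_1, \cdots, R_m)]\!]_\Delta &=
  \bigcup_{(c(\beta_1, \cdots, \beta_m) \to f(\tau_1, \cdots, \tau_n)) \in \Delta}
  \left( \begin{array}{l}
    \text{let } k = \{\beta_j \mapsto R_j \mid 1 \le j \le m\} \\
    \text{in } \{ f(t_1, \cdots, t_n) \mid \forall 1 \le i \le n.\ t_i \in [\![k(\tau_i)]\!]_\Delta \}
  \end{array} \right)
\end{aligned}
\]

■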

The function ⟦·⟧_Δ gives fixed denotations to and, or, 1 and 0. Type constructors
and and or are interpreted as set intersection and set union respectively. The type
constructor 1 denotes Term and 0 the empty set. We say that a term t is in a
type R iff t ∈ ⟦R⟧_Δ. Set inclusion and ⟦·⟧_Δ induce a pre-order ⊑ on types: (R1 ⊑
R2) = (⟦R1⟧_Δ ⊆ ⟦R2⟧_Δ) and an equivalence relation ≡ on types: (R1 ≡ R2) = (R1 ⊑
R2) ∧ (R2 ⊑ R1).
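Example 4.3

Continuing with Example 4.1, we have

\[
\begin{array}{rcl}
{[nat]}_\Delta & = & \{0, s(0), s(s(0)), \ldots\} \\
{[list(0)]}_\Delta & = & \{[\,]\} \\
{[list(1)]}_\Delta & = & \{[\,], [x \,|\, [\,]], \ldots\}
\end{array}
\]

where $x \in Var$. Observe that $or(list(even), list(odd)) \not\equiv list(nat)$ since $[0, s(0)] \in [list(nat)]_\Delta$
and $[0, s(0)] \notin [list(even)]_\Delta$ and $[0, s(0)] \notin [list(odd)]_\Delta$. ∎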
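The observation in Example 4.3 can be checked mechanically. The following Python sketch is a minimal membership test that follows Definition 4.2 by recursion on the term; the tuple encoding of terms and types, the symbol names `nil` and `cons` for `[ ]` and `[ | ]`, and the helper names are assumptions made for illustration only.

```python
ONE, ZERO = ("1",), ("0",)
nat, even, odd = ("nat",), ("even",), ("odd",)

# Type rules of Example 4.1 as (head scheme, function symbol, argument schemes).
RULES = [
    (("nat",), "0", ()), (("nat",), "s", (nat,)),
    (("even",), "0", ()), (("even",), "s", (odd,)),
    (("odd",), "s", (even,)),
    (("list", "b"), "nil", ()),
    (("list", "b"), "cons", ("b", ("list", "b"))),
    (("tree", "b"), "void", ()),
    (("tree", "b"), "tr", ("b", ("tree", "b"), ("tree", "b"))),
]

def subst(k, scheme):
    """Replace each type parameter (a string) in scheme by its binding in k."""
    if isinstance(scheme, str):
        return k.get(scheme, ZERO)
    return (scheme[0], *(subst(k, a) for a in scheme[1:]))

def member(t, R):
    """Decide whether term t is in type R; variables are strings, other terms tuples."""
    if R == ONE:
        return True
    if R == ZERO:
        return False
    if R[0] == "and":
        return member(t, R[1]) and member(t, R[2])
    if R[0] == "or":
        return member(t, R[1]) or member(t, R[2])
    if isinstance(t, str):               # a variable is in no constructor type
        return False
    for head, f, schemes in RULES:
        if (head[0], len(head)) == (R[0], len(R)) and (f, len(schemes)) == (t[0], len(t) - 1):
            k = dict(zip(head[1:], R[1:]))
            if all(member(ti, subst(k, s)) for ti, s in zip(t[1:], schemes)):
                return True
    return False

zero = ("0",)
lst = ("cons", zero, ("cons", ("s", zero), ("nil",)))   # the list [0, s(0)]
in_list_nat = member(lst, ("list", nat))
in_union = member(lst, ("or", ("list", even), ("list", odd)))
```

Under these assumptions `in_list_nat` is true while `in_union` is false, which is the reason the two types are not equivalent.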

The type constructors $and$ and $or$ will sometimes be written as infix operators,
i.e., $and(R_1, R_2)$ is written as $(R_1 \text{ and } R_2)$ and $or(R_1, R_2)$ as $(R_1 \text{ or } R_2)$. A type
is atomic if its main constructor is neither $and$ nor $or$. A type is conjunctive if
it is of the form $and_{1 \le i \le k} A_i$ where each $A_i$ is atomic. By an obvious analogy to
propositional logic, for any type $R$, there is a type of the form $or_{1 \le j \le m} C_j$ such
that each $C_j$ is conjunctive and $R \equiv or_{1 \le j \le m} C_j$. We call $or_{1 \le j \le m} C_j$ a disjunctive
normal form of $R$.

A term in a type may contain variables. The following lemma states that types are closed
under instantiation.

Lemma 4.4

Let $R \in Type$ and $t \in Term$. If $t \in [R]_\Delta$ then $\sigma(t) \in [R]_\Delta$ for any $\sigma \in Sub$. ∎

Type rules in $\Delta$ are production rules for a context-free tree grammar (Comon
et al. 2002; Gecseg and Steinby 1984). The complement of the denotation of a
type is not necessarily closed under instantiation. For instance, let $\Delta$ be defined
as in Example 4.1, $x \in Var$ and $\sigma = \{x \mapsto s(0)\}$. Observe that $x \notin [nat]_\Delta$ and
$\sigma(x) \in [nat]_\Delta$. Since $x \in Term \setminus [nat]_\Delta$ and $\sigma(x) \notin Term \setminus [nat]_\Delta$, $Term \setminus [nat]_\Delta$ is not
closed under instantiation and cannot be denoted by a type in $Type$. The example
shows that the family of types is not closed under complement. This explains why
set complement is not a type constructor.

Types have also been defined using tree automata (Gecseg and Steinby 1984; 
Comon et al. 2002), regular term grammars (Dart and Zobel 1992b; Smaus 2001; 
Lagoon and Stuckey 2001), and regular unary logic programs (Yardeni and Shapiro 
1991). A type defined in such a formalism denotes a regular set of ground terms. The
meaning function $[\cdot]_\Delta$ interprets a type as a set of possibly non-ground terms; in
particular, it interprets 1 as the set of all terms. Type rules are used to propagate type
information during analysis. Let $x$ be of type $nat$ and $y$ of type $list(atom)$. Then the
type rule $list(\beta) \to [\beta \,|\, list(\beta)]$ is used to infer that $[x|y]$ is of type $list(or(nat, atom))$.
The type parameter $\beta$ is not only used as a placeholder but also used in folding
heterogeneous types precisely via the non-discriminative union operator.

4.3 Type Sequences

During propagation of type information, it is necessary to work with type sequences.
A type sequence expression is an expression consisting of type sequences of the
same dimension and constructors $and$ and $or$. Note that constructors $and$ and $or$ are
overloaded. The dimension of the type sequence expression is defined to be that of
a type sequence in it. Let $R \in Type$, $\vec{R} \in Type^*$ and $E_1$ and $E_2$ be type sequence
expressions. We extend $[\cdot]_\Delta$ to type sequence expressions as follows.

\[
\begin{array}{rcl}
{[\epsilon]}_\Delta & = & \{\epsilon\} \\
{[R \bullet \vec{R}]}_\Delta & = & [R]_\Delta \times [\vec{R}]_\Delta \\
{[E_1 \text{ and } E_2]}_\Delta & = & [E_1]_\Delta \cap [E_2]_\Delta \\
{[E_1 \text{ or } E_2]}_\Delta & = & [E_1]_\Delta \cup [E_2]_\Delta
\end{array}
\]

The relations $\sqsubseteq$ and $\equiv$ on types carry over naturally to type sequence expressions.
An occurrence of 0 (respectively 1) stands for the type sequence of 0's (respectively
1's) with a dimension appropriate for the occurrence.

5 Abstract Domain 

Abstract substitutions in our type analysis are type constraints represented as a set
of variable typings which are mappings from variables to types. A variable typing
represents the conjunction of primitive type constraints of the form $x \in R$. For
instance, the variable typing $\{x \mapsto nat, y \mapsto even\}$ represents the type constraint
$(x \in nat) \wedge (y \in even)$. The restriction of a variable typing $\mu$ to a set $V$ of variables
is defined as

\[ \mu \uparrow V = \lambda x.(\text{if } x \in V \text{ then } \mu(x) \text{ else } 1) \]

The denotation of a variable typing is given by $\gamma_{VT} : (V_P \mapsto Type) \mapsto \wp(Sub)$
defined

\[ \gamma_{VT}(\mu) = \{\theta \mid \forall x \in V_P.(\theta(x) \in [\mu(x)]_\Delta)\} \]

For instance, $\gamma_{VT}(\{x \mapsto nat, y \mapsto list(nat)\}) = \{\theta \mid \theta(x) \in [nat]_\Delta \wedge \theta(y) \in [list(nat)]_\Delta\}$.
The denotation of a set of variable typings is the set union of the
denotations of its elements.

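Example 5.1

Letting $S = \{\{x \mapsto nat, y \mapsto list(nat)\}, \{x \mapsto list(nat), y \mapsto nat\}\}$, $S$
denotes $\{\theta \mid \theta(x) \in [nat]_\Delta \wedge \theta(y) \in [list(nat)]_\Delta\} \cup \{\theta \mid \theta(x) \in [list(nat)]_\Delta \wedge \theta(y) \in [nat]_\Delta\}$. ∎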

There may be many sets of variable typings that denote the same set of
substitutions. Firstly, two different type expressions in $Type$ may denote the same set of
terms. For instance, $[nat \text{ and } list(1)]_\Delta = [0]_\Delta$ using $\Delta$ in Example 4.1. Secondly, an
element of a set of variable typings may have a smaller denotation than another.
For example, let $S = \{\{x \mapsto list(1)\}, \{x \mapsto list(nat)\}\}$. Then $S$ has the same
denotation as one of its proper subsets $S' = \{\{x \mapsto list(1)\}\}$. Those abstract elements
that have the same denotation are identified. Let $\preceq$ on $\wp(V_P \mapsto Type)$ be defined
as $S_1 \preceq S_2 = (\bigcup_{\mu \in S_1} \gamma_{VT}(\mu)) \subseteq (\bigcup_{\nu \in S_2} \gamma_{VT}(\nu))$. It is a pre-order and induces an
equivalence relation $\approx$ on $\wp(V_P \mapsto Type)$: $(S_1 \approx S_2) = (S_1 \preceq S_2) \wedge (S_2 \preceq S_1)$. The
equivalence classes with respect to $\approx$ are abstract substitutions. Thus, the abstract
domain is $\langle ASub^\flat, \sqsubseteq^\flat \rangle$ where

\[
\begin{array}{rcl}
ASub^\flat & = & \wp(V_P \mapsto Type)/_{\approx} \\
\sqsubseteq^\flat & = & \preceq/_{\approx}
\end{array}
\]

$\langle ASub^\flat, \sqsubseteq^\flat \rangle$ is a complete lattice. Its join and meet operators are respectively
$[S_1]_\approx \sqcup^\flat [S_2]_\approx = [S_1 \cup S_2]_\approx$ and $[S_1]_\approx \sqcap^\flat [S_2]_\approx = [S_1^{\downarrow} \cap S_2^{\downarrow}]_\approx$ where
$S_j^{\downarrow} = \{\mu \in (V_P \mapsto Type) \mid \exists \nu \in S_j.(\gamma_{VT}(\mu) \subseteq \gamma_{VT}(\nu))\}$. The infimum is $[\emptyset]_\approx$ and the supremum
$[\{x \mapsto 1 \mid x \in V_P\}]_\approx$. The concretization function $\gamma : ASub^\flat \mapsto \wp(Sub)$ is defined

\[ \gamma([S]_\approx) = \bigcup_{\mu \in S} \gamma_{VT}(\mu) \]

The following lemma states that $\gamma$ satisfies the safety requirement C2.

Lemma 5.2

$\gamma(ASub^\flat)$ is a Moore family. ∎

The definition of $\sqcap^\flat$ is not constructive since the downward closure of a set of
variable typings $S$ can be infinite. For instance, letting $S = \{\{x \mapsto list(1)\}\}$,
$\{x \mapsto list^k(nat)\}$ is in $S^{\downarrow}$ for any $k \ge 1$. The following operator
$\otimes : \wp(V_P \mapsto Type) \times \wp(V_P \mapsto Type) \mapsto \wp(V_P \mapsto Type)$ computes effectively the meet of abstract
substitutions.

\[ S_1 \otimes S_2 = \{\{x \mapsto (\mu(x) \text{ and } \nu(x)) \mid x \in V_P\} \mid \mu \in S_1 \wedge \nu \in S_2\} \]

If $S_1$ and $S_2$ are finite representatives of two abstract substitutions then $S_1 \otimes S_2$
is a finite representative of the meet of the abstract substitutions, as stated
in the following lemma.

Lemma 5.3

$\gamma([S_1 \otimes S_2]_\approx) = \gamma([S_1]_\approx \sqcap^\flat [S_2]_\approx)$. ∎
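The operator that computes the meet on finite representatives is a pairwise, pointwise conjunction. A minimal Python sketch follows, assuming variable typings are dictionaries, sets of typings are lists, and types are nested tuples; the helper `mk_and`, which drops a redundant 1, is a simplification added here for readability and is not part of the definition above.

```python
from itertools import product

ONE = ("1",)

def mk_and(r1, r2):
    """Build (r1 and r2), simplifying when one side is 1 or both sides coincide."""
    if r1 == ONE:
        return r2
    if r2 == ONE or r1 == r2:
        return r1
    return ("and", r1, r2)

def otimes(s1, s2, variables):
    """Combine every typing in s1 with every typing in s2, variable by variable."""
    return [{x: mk_and(mu.get(x, ONE), nu.get(x, ONE)) for x in variables}
            for mu, nu in product(s1, s2)]

nat, even = ("nat",), ("even",)
s1 = [{"x": nat}, {"x": ("list", nat)}]
s2 = [{"x": ONE, "y": even}]
meet = otimes(s1, s2, ["x", "y"])
```

The result has one typing per pair of operands, so its size is bounded by the product of the sizes of the two inputs.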

We will use a fixed renaming substitution $\Psi$ such that $V_P \cap \Psi(V_P) = \emptyset$ and define
$V'_P = V_P \cup \Psi(V_P)$. The relation $\approx$, the functions $\gamma_{VT}$ and $\gamma$ and the operator $\otimes$
extend naturally to sets of variable typings over $V'_P$. Let $\mu$ be a variable typing, $S$ a
set of variable typings and $\theta$ a substitution. We say that $\theta$ satisfies $\mu$ if $\theta \in \gamma_{VT}(\mu)$,
and we say that $\theta$ satisfies $S$ if $\theta \in \gamma([S]_\approx)$.

The conditions C1 and C2 are satisfied. C1 holds because $\langle ASub^\flat, \sqsubseteq^\flat, \sqcup^\flat \rangle$ is a
complete lattice. C2 is implied by Lemma 5.2.



6 Abstract Operations 

The design of our type analysis is completed with four abstract operations required
by the abstract semantics given in Section 3. One operation is $\sqcup^\flat$ which is the
least upper bound on $\langle ASub^\flat, \sqsubseteq^\flat \rangle$. Let $Id^\flat = [\{\lambda x \in V_P.1\}]_\approx$. The operation $Id^\flat$
obviously satisfies the condition C3 and thus safely abstracts $\{\epsilon\}$ with respect to $\gamma$.
Since the abstract built-in execution operation $Sys^\flat$ makes use of ancillary operations
for the abstract unification operation $Uf^\flat$, we present $Uf^\flat$ before $Sys^\flat$.

6.1 Outline of Abstract Unification 

The abstract unification operator $Uf^\flat$ takes two atoms and two abstract
substitutions and computes an abstract substitution. The computation is reduced to solving
a constraint that consists of a set of equations in solved form $E$ and a set of variable
typings $S_1$. The solution to the constraint is a set of variable typings $S$. In order
to ensure that $Uf^\flat$ safely abstracts $Uf$, $S$ is required to describe the set of all
those substitutions that satisfy both $E$ and $S_1$. Let $E = \{x_1 = t_1, \ldots, x_n = t_n\}$.
The set $S$ is computed in two steps. In the first step, type information about $x_i$ is
used to derive more type information about the variables in $t_i$. This is a downward
propagation since type information is propagated from a term to its sub-terms. The
second step propagates type information in the opposite direction. It derives more
type information about $x_i$ from type information about the variables in $t_i$.

For an illustration, let $E = \{x = [w], y = [w]\}$ and $S_1 = \{\mu\}$ where
$\mu = \{w \mapsto 1, x \mapsto list(atom \text{ or } float), y \mapsto list(atom \text{ or } integer)\}$. During the downward
propagation step, more type information for $w$ is derived from type information
for both $x$ and $y$. Since $\mu(x) = list(atom \text{ or } float)$ and $x = [w]$, $[w]$ is of type
$list(atom \text{ or } float)$. Since there is only one type rule for $[\cdot|\cdot]$, namely $list(\beta) \to [\beta \,|\, list(\beta)]$,
we deduce that $w$ is of type $(atom \text{ or } float)$. Similarly, we deduce that $w$ is of
type $(atom \text{ or } integer)$ since $\mu(y) = list(atom \text{ or } integer)$ and $y = [w]$. So, $w$ is
of type $((atom \text{ or } float) \text{ and } (atom \text{ or } integer))$, which is equivalent to $atom$. The
derived type $atom$ for $w$ is used to strengthen $\mu$ into
$\nu = \{w \mapsto atom, x \mapsto list(atom \text{ or } float), y \mapsto list(atom \text{ or } integer)\}$. During the upward propagation
step, more type information for both $x$ and $y$ is derived from type information for $w$.
Note that $[w]$ is an abbreviation for $[w|[\,]]$. By applying the type rule $list(\beta) \to [\,]$,
we infer that $[\,]$ is of type $list(0)$. Since $\nu(w) = atom$, we derive that $[w]$ is of type
$list(atom)$ by applying the type rule $list(\beta) \to [\beta \,|\, list(\beta)]$. We deduce that both $x$
and $y$ are of type $list(atom)$ since $x = [w]$ and $y = [w]$. The derived type $list(atom)$
for $x$ and $y$ is used to strengthen $\nu$, resulting in the singleton set of variable typings
$S = \{\{w \mapsto atom, x \mapsto list(atom), y \mapsto list(atom)\}\}$.

Both the downward and upward propagation steps in the preceding example produce a single output variable
typing from an input variable typing. In more general cases, both steps may yield
multiple output variable typings from an input variable typing. We now present
these two steps in detail.

6.2 Downward Propagation 

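Downward propagation requires propagating a type $R$ downwards (the structure of)
a term $t \in Term(\Sigma, V'_P)$. Let $\Theta = \{\theta \mid \theta(t) \in [R]_\Delta\}$. Propagation of $R$ downwards $t$
calculates a set of variable typings $S$ (computed as $vts(R, t)$) such that $\Theta \subseteq \gamma([S]_\approx)$,
that is, $S$ describes the set of all those substitutions that instantiate $t$ to a term of
type $R$. This is done by a case analysis.

If $R = 1$ then $\Theta = Sub$ since $\theta(t)$ is in $R$ for
any $\theta \in Sub$. Put $S = \{\lambda y \in V'_P.1\}$. Then $S$ satisfies the condition that $\Theta \subseteq \gamma([S]_\approx)$.
If $t \in V'_P$ then $S = \{\lambda y \in V'_P.(\text{if } y = t \text{ then } R \text{ else } 1)\}$ satisfies the condition that
$\Theta \subseteq \gamma([S]_\approx)$.

Consider the case $R = (R_1 \text{ or } R_2)$. We have $\Theta = \Theta_1 \cup \Theta_2$ where
$\Theta_1 = \{\theta_1 \mid \theta_1(t) \in [R_1]_\Delta\}$ and $\Theta_2 = \{\theta_2 \mid \theta_2(t) \in [R_2]_\Delta\}$. We propagate the types $R_1$
and $R_2$ downwards $t$ separately, obtaining two sets of variable typings $S_1$ and $S_2$
such that $\Theta_1 \subseteq \gamma([S_1]_\approx)$ and $\Theta_2 \subseteq \gamma([S_2]_\approx)$. Put $S = S_1 \cup S_2$. Then the condition
that $\Theta \subseteq \gamma([S]_\approx)$ is satisfied. For the case $R = (R_1 \text{ and } R_2)$, $S = S_1 \otimes S_2$ satisfies the
condition that $\Theta \subseteq \gamma([S]_\approx)$ where $S_1$ and $S_2$ are obtained as above.

Consider the
remaining case $R = c(R_1, \ldots, R_m)$ and $t = f(t_1, \ldots, t_n)$. Assume that there are $k$
type rules $T^1, \ldots, T^k$ for $c/m$ and $f/n$ and $T^j$ is $c(\beta^j_1, \ldots, \beta^j_m) \to f(\tau^j_1, \ldots, \tau^j_n)$. By
the definition of $[\cdot]_\Delta$, $\Theta = \bigcup_{1 \le j \le k} \Theta^j$ where

\[
\begin{array}{rcl}
\Theta^j & = & \{\theta \mid \theta(f(t_1, \ldots, t_n)) \in \{f(s_1, \ldots, s_n) \mid \forall 1 \le i \le n.(s_i \in [\Bbbk^j(\tau^j_i)]_\Delta)\}\} \\
 & = & \{\theta \mid f(\theta(t_1), \ldots, \theta(t_n)) \in \{f(s_1, \ldots, s_n) \mid \forall 1 \le i \le n.(s_i \in [\Bbbk^j(\tau^j_i)]_\Delta)\}\} \\
 & = & \{\theta \mid \forall 1 \le i \le n.(\theta(t_i) \in [\Bbbk^j(\tau^j_i)]_\Delta)\} \\
 & = & \Theta^j_1 \cap \Theta^j_2 \cap \cdots \cap \Theta^j_n
\end{array}
\]

and $\Bbbk^j = \{\beta^j_1 \mapsto R_1, \ldots, \beta^j_m \mapsto R_m\}$ and $\Theta^j_i = \{\theta \mid \theta(t_i) \in [\Bbbk^j(\tau^j_i)]_\Delta\}$. We obtain
$S$ as follows. We first propagate type $\Bbbk^j(\tau^j_i)$ downwards term $t_i$, obtaining a set of
variable typings $S^j_i$. We have that $\Theta^j_i \subseteq \gamma([S^j_i]_\approx)$. We then calculate $S^j = S^j_1 \otimes \cdots \otimes S^j_n$
for the type rule $T^j$. The set $S^j$ satisfies the condition that $\Theta^j \subseteq \gamma([S^j]_\approx)$.
Finally, we compute $S = S^1 \cup \cdots \cup S^k$. Since $\Theta^j \subseteq \gamma([S^j]_\approx)$ and $\Theta = \bigcup_{1 \le j \le k} \Theta^j$,
$S$ satisfies the condition that $\Theta \subseteq \gamma([S]_\approx)$.

In summary, $S = vts(R, t)$ where
$vts : Type \times Term(\Sigma, V'_P) \mapsto \wp(V'_P \mapsto Type)$ is defined

\[
\begin{array}{rcl}
vts(1, t) & = & \{\lambda y \in V'_P.1\} \\
vts(R, x) & = & \{\lambda y \in V'_P.(\text{if } y = x \text{ then } R \text{ else } 1)\} \\
vts((R_1 \text{ and } R_2), t) & = & vts(R_1, t) \otimes vts(R_2, t) \\
vts((R_1 \text{ or } R_2), t) & = & vts(R_1, t) \cup vts(R_2, t) \\
vts(c(R_1, \ldots, R_m), f(t_1, \ldots, t_n)) & = &
  \displaystyle\bigcup_{(c(\beta_1, \ldots, \beta_m) \to f(\tau_1, \ldots, \tau_n)) \in \Delta}
  \left( \begin{array}{l}
    \text{let } \Bbbk = \{\beta_j \mapsto R_j \mid 1 \le j \le m\} \text{ in} \\
    \bigotimes_{1 \le i \le n} vts(\Bbbk(\tau_i), t_i)
  \end{array} \right)
\end{array}
\]

where $x \in V'_P$, $f/n \in \Sigma$ and $c/m \in Cons$. The first applicable alternative applies when there are
multiple applicable alternatives.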

The following lemma states that $vts(R, t)$ describes all the substitutions that
instantiate $t$ to a term of type $R$.

Lemma 6.1

For any $R \in Type$ and $t \in Term(\Sigma, V'_P)$, $\{\theta \mid \theta(t) \in [R]_\Delta\} \subseteq \gamma([vts(R, t)]_\approx)$. ∎

We now consider the overall downward propagation given a set of variable typings
$S$ and a set of equations in solved form $E = \{x_1 = t_1, \ldots, x_n = t_n\}$. Each variable
typing $\mu$ in $S$ is processed separately as follows. We first propagate the type $\mu(x_i)$
downwards $t_i$. This results in a set of variable typings $vts(\mu(x_i), t_i)$ which describes
all the substitutions that instantiate $t_i$ to a term of type $\mu(x_i)$. We then calculate
$S_\mu = vts(\mu(x_1), t_1) \otimes \cdots \otimes vts(\mu(x_n), t_n)$. The set $S_\mu$ describes all the substitutions
that instantiate $t_i$ to a term of type $\mu(x_i)$ for all $1 \le i \le n$. We finally conjoin $S_\mu$
with $\{\mu\}$, obtaining $\{\mu\} \otimes S_\mu$ which describes all the substitutions that satisfy both $\mu$
and $E$. After each variable typing in $S$ is processed, results from different variable
typings are joined together using set union. The overall downward propagation
function $down : \wp(Eqn) \times \wp(V'_P \mapsto Type) \mapsto \wp(V'_P \mapsto Type)$ is defined

\[ down(E, S) = \bigcup_{\mu \in S} \Big( \{\mu\} \otimes \bigotimes_{(x = t) \in E} vts(\mu(x), t) \Big) \qquad (1) \]

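Example 6.2

Let $V'_P = \{x, y\}$, $S = \{\{x \mapsto 1, y \mapsto (list(nat) \text{ or } nat)\}\}$ and $\Delta$ be that in
Example 4.1. We have $vts(list(nat), [x|[\,]]) = \{\{x \mapsto nat, y \mapsto 1\}\}$ and $vts(nat, [x|[\,]]) = \emptyset$.
So,

\[ vts(list(nat) \text{ or } nat, [x|[\,]]) = \{\{x \mapsto nat, y \mapsto 1\}\} \]

and

\[
\begin{array}{rcl}
down(\{y = [x|[\,]]\}, S) & = & \{\{x \mapsto 1, y \mapsto (list(nat) \text{ or } nat)\}\} \otimes \{\{x \mapsto nat, y \mapsto 1\}\} \\
 & = & \{\mu\}
\end{array}
\]

where $\mu = \{x \mapsto nat, y \mapsto (list(nat) \text{ or } nat)\}$. ∎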
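The computation in Example 6.2 can be replayed with a short Python sketch of downward propagation. Everything about the encoding is an assumption made for illustration: terms and types are nested tuples, variables and type parameters are strings, `nil` and `cons` stand for `[ ]` and `[ | ]`, variable typings are frozen into sorted tuples so that they can be stored in sets, and `mk_and` drops a redundant 1 so that results print in the simplified form used in the example.

```python
from itertools import product

ONE = ("1",)
nat = ("nat",)
RULES = [  # (head scheme, function symbol, argument schemes), a subset of Example 4.1
    (("nat",), "0", ()), (("nat",), "s", (nat,)),
    (("list", "b"), "nil", ()),
    (("list", "b"), "cons", ("b", ("list", "b"))),
]
VARS = ("x", "y")            # the variable set of Example 6.2

def subst(k, scheme):
    if isinstance(scheme, str):
        return k.get(scheme, ("0",))
    return (scheme[0], *(subst(k, a) for a in scheme[1:]))

def mk_and(r1, r2):
    if r1 == ONE:
        return r2
    if r2 == ONE or r1 == r2:
        return r1
    return ("and", r1, r2)

def freeze(mu):
    return tuple(sorted(mu.items()))     # hashable form of a variable typing

def otimes(s1, s2):
    return {freeze({x: mk_and(dict(m)[x], dict(n)[x]) for x in VARS})
            for m, n in product(s1, s2)}

TOP_TYPING = freeze({x: ONE for x in VARS})

def vts(R, t):
    """Typings describing the substitutions that instantiate t to a term of type R."""
    if R == ONE:
        return {TOP_TYPING}
    if isinstance(t, str):               # t is a variable
        return {freeze({x: (R if x == t else ONE) for x in VARS})}
    if R[0] == "and":
        return otimes(vts(R[1], t), vts(R[2], t))
    if R[0] == "or":
        return vts(R[1], t) | vts(R[2], t)
    out = set()
    for head, f, schemes in RULES:       # one contribution per matching type rule
        if (head[0], len(head)) == (R[0], len(R)) and (f, len(schemes)) == (t[0], len(t) - 1):
            k = dict(zip(head[1:], R[1:]))
            acc = {TOP_TYPING}
            for ti, s in zip(t[1:], schemes):
                acc = otimes(acc, vts(subst(k, s), ti))
            out |= acc
    return out

def down(E, S):
    """Overall downward propagation; E is a list of (variable, term) equations."""
    out = set()
    for mu in S:
        acc = {mu}
        for x, t in E:
            acc = otimes(acc, vts(dict(mu)[x], t))
        out |= acc
    return out

term = ("cons", "x", ("nil",))           # the term [x|[]]
S = {freeze({"x": ONE, "y": ("or", ("list", nat), nat)})}
result = down([("y", term)], S)
```

With these inputs `result` contains the single typing that binds `x` to `nat` and `y` to `(list(nat) or nat)`, in agreement with the example.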

The following lemma states the correctness of downward propagation.

Lemma 6.3

Let $S' = down(E, S)$. Then $mgu(\theta(E)) \circ \theta \in \gamma([S']_\approx)$ for all $\theta \in \gamma([S]_\approx)$. ∎

6.3 Upward Propagation

We now consider upward propagation of type information. The key step in
upward propagation is to compute a type for a term from those of its variables.
We first consider how a type rule $\tau \to f(\tau_1, \ldots, \tau_n)$ can be applied to compute a
type of $f(t_1, \ldots, t_n)$ from types of its top-level sub-terms $t_1, \ldots, t_n$. Let $R_i$ be
the type of $t_i$. A simplistic approach would compute a type substitution $\Bbbk$ such
that $(R_1, \ldots, R_n) \sqsubseteq \Bbbk((\tau_1, \ldots, \tau_n))$ and then return $\Bbbk(\tau)$ as the type of $t$.
However, this leads to loss of precision. Consider the term $[x|y]$ and the type rule
$list(\beta) \to [\beta \,|\, list(\beta)]$. Let the types of $x$ and $y$ be $(even \text{ or } odd)$ and $list(0)$. Then
the minimal type substitution $\Bbbk$ such that $(even \text{ or } odd, list(0)) \sqsubseteq \Bbbk((\beta, list(\beta)))$ is
$\Bbbk = \{\beta \mapsto (even \text{ or } odd)\}$. We would obtain $\Bbbk(list(\beta)) = list(even \text{ or } odd)$ as a type
of $[x|y]$. A more precise type of $[x|y]$ is $(list(even) \text{ or } list(odd))$. We first compute a
set of type substitutions $\mathcal{K}$ such that $(R_1, \ldots, R_n) \sqsubseteq or_{\Bbbk \in \mathcal{K}} \Bbbk((\tau_1, \ldots, \tau_n))$ and then
return $or_{\Bbbk \in \mathcal{K}} \Bbbk(\tau)$ as a type of $f(t_1, \ldots, t_n)$. Continue with the above example. Let
$\mathcal{K} = \{\{\beta \mapsto even\}, \{\beta \mapsto odd\}\}$. Then $(even \text{ or } odd, list(0)) \sqsubseteq or_{\Bbbk \in \mathcal{K}} \Bbbk((\beta, list(\beta)))$.
We obtain $(list(even) \text{ or } list(odd))$ as a type of $[x|y]$.

Definition 6.4

Let $\tau \in Schm$, $\vec{\tau} \in Schm^*$, $R \in Type$, $\vec{R} \in Type^*$ and $\mathcal{K} \in \wp(TSub)$. We say that $\mathcal{K}$
is a cover for $R$ and $\tau$ iff $R \sqsubseteq or_{\Bbbk \in \mathcal{K}} \Bbbk(\tau)$. We say that $\mathcal{K}$ is a cover for $\vec{R}$ and $\vec{\tau}$ iff
$\vec{R} \sqsubseteq or_{\Bbbk \in \mathcal{K}} \Bbbk(\vec{\tau})$. ∎

Calculating a cover for a type and a type scheme is a key task in upward propa- 
gation of type information. Before defining a function that does the computation, 
we need some operations on type substitutions. 



6.3.1 Operations on Type Substitutions 

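We first introduce an operation for calculating an upper bound of two type
substitutions. It is the point-wise extension of $or$ when both of its operands are mappings
from type parameters to types. Define $\curlyvee : TSub \times TSub \mapsto TSub$ as follows.

\[
\Bbbk_1 \curlyvee \Bbbk_2 = \left\{ \begin{array}{ll}
\top, & \text{if } (\Bbbk_1 = \top) \vee (\Bbbk_2 = \top); \\
\Bbbk_2, & \text{else if } (\Bbbk_1 = \bot); \\
\Bbbk_1, & \text{else if } (\Bbbk_2 = \bot); \\
\{\beta \mapsto (\Bbbk_1(\beta) \text{ or } \Bbbk_2(\beta)) \mid \beta \in dom(\Bbbk_1) \cup dom(\Bbbk_2)\}, & \text{otherwise.}
\end{array} \right.
\]

An operation $\curlywedge : TSub \times TSub \mapsto TSub$ that calculates a lower bound of type
substitutions is defined dually:

\[
\Bbbk_1 \curlywedge \Bbbk_2 = \left\{ \begin{array}{ll}
\bot, & \text{if } (\Bbbk_1 = \bot) \vee (\Bbbk_2 = \bot); \\
\Bbbk_2, & \text{else if } (\Bbbk_1 = \top); \\
\Bbbk_1, & \text{else if } (\Bbbk_2 = \top); \\
\{\beta \mapsto (\Bbbk_1(\beta) \text{ and } \Bbbk_2(\beta)) \mid \beta \in dom(\Bbbk_1) \cap dom(\Bbbk_2)\}, & \text{otherwise.}
\end{array} \right.
\]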



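The following lemma states that the operations $\curlyvee$ and $\curlywedge$ indeed compute upper
and lower bounds of two type substitutions respectively.

Lemma 6.5

For any $\tau \in Schm$ and any $\Bbbk_1, \Bbbk_2 \in TSub$,

(a) $(\Bbbk_1(\tau) \text{ or } \Bbbk_2(\tau)) \sqsubseteq (\Bbbk_1 \curlyvee \Bbbk_2)(\tau)$; and

(b) $(\Bbbk_1(\tau) \text{ and } \Bbbk_2(\tau)) \equiv (\Bbbk_1 \curlywedge \Bbbk_2)(\tau)$.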



While the type substitution operation is a meet homomorphism according to
Lemma 6.5(b), it is not a join homomorphism. For instance, let $\tau = list(\beta)$,
$\Bbbk_1 = \{\beta \mapsto nat\}$ and $\Bbbk_2 = \{\beta \mapsto list(nat)\}$. Then $\Bbbk_1 \curlyvee \Bbbk_2 = \{\beta \mapsto (nat \text{ or } list(nat))\}$,
$(\Bbbk_1 \curlyvee \Bbbk_2)(\tau) = list(nat \text{ or } list(nat))$, and $\Bbbk_1(\tau) \text{ or } \Bbbk_2(\tau) = list(nat) \text{ or } list(list(nat))$.
Observe that $(\Bbbk_1(\tau) \text{ or } \Bbbk_2(\tau)) \not\equiv (\Bbbk_1 \curlyvee \Bbbk_2)(\tau)$ since the term $[0, [0]]$ has type
$list(nat \text{ or } list(nat))$ but it does not have type $(list(nat) \text{ or } list(list(nat)))$.
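The two bound operations on type substitutions are simple enough to sketch directly. In the Python fragment below, the representation is assumed for illustration only: the special substitutions are the strings `"TOP"` and `"BOT"`, ordinary substitutions are dictionaries from parameter names to types, and a parameter outside the domain is read as the type 0.

```python
TOP, BOT = "TOP", "BOT"
ZERO = ("0",)

def join(k1, k2):
    """Upper bound of two type substitutions (point-wise or)."""
    if k1 == TOP or k2 == TOP:
        return TOP
    if k1 == BOT:
        return k2
    if k2 == BOT:
        return k1
    return {b: ("or", k1.get(b, ZERO), k2.get(b, ZERO))
            for b in k1.keys() | k2.keys()}

def meet(k1, k2):
    """Lower bound of two type substitutions (point-wise and)."""
    if k1 == BOT or k2 == BOT:
        return BOT
    if k1 == TOP:
        return k2
    if k2 == TOP:
        return k1
    return {b: ("and", k1[b], k2[b]) for b in k1.keys() & k2.keys()}

nat = ("nat",)
k1 = {"b": nat}
k2 = {"b": ("list", nat)}
joined = join(k1, k2)    # binds b to (nat or list(nat)), as in the example above
```

The sketch does not simplify the resulting types; an implementation would normally normalise them so that equivalent substitutions are recognised as such.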

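Let $\mathcal{K}_1$ and $\mathcal{K}_2$ be sets of type substitutions. We say that $\mathcal{K}_1$ and $\mathcal{K}_2$ are
equivalent, denoted as $\mathcal{K}_1 \cong \mathcal{K}_2$, iff $(or_{\Bbbk \in \mathcal{K}_1} \Bbbk(\tau)) \equiv (or_{\Bbbk \in \mathcal{K}_2} \Bbbk(\tau))$ for any type scheme
$\tau$. Define $\curlyvee, \curlywedge : \wp(TSub) \times \wp(TSub) \mapsto \wp(TSub)$ as the set extensions of $\curlyvee$ and $\curlywedge$
respectively:

\[
\begin{array}{rcl}
\mathcal{K}_1 \curlyvee \mathcal{K}_2 & = & \{\Bbbk_1 \curlyvee \Bbbk_2 \mid \Bbbk_1 \in \mathcal{K}_1 \wedge \Bbbk_2 \in \mathcal{K}_2\} \\
\mathcal{K}_1 \curlywedge \mathcal{K}_2 & = & \{\Bbbk_1 \curlywedge \Bbbk_2 \mid \Bbbk_1 \in \mathcal{K}_1 \wedge \Bbbk_2 \in \mathcal{K}_2\}
\end{array}
\]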

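Example 6.6

Let $\mathcal{K}_1 = \{\{\beta_1 \mapsto tree(nat), \beta_2 \mapsto nat\}, \{\beta_1 \mapsto list(nat), \beta_2 \mapsto nat\}\}$ and
$\mathcal{K}_2 = \{\{\beta_1 \mapsto list(even), \beta_2 \mapsto even\}\}$. Since $even \sqsubseteq nat$ and $list(even) \sqsubseteq list(nat)$, we
have

\[
\begin{array}{rcl}
\mathcal{K}_1 \curlyvee \mathcal{K}_2 & = & \left\{ \begin{array}{l}
  \{\beta_1 \mapsto tree(nat) \text{ or } list(even), \beta_2 \mapsto nat \text{ or } even\}, \\
  \{\beta_1 \mapsto list(nat) \text{ or } list(even), \beta_2 \mapsto nat \text{ or } even\}
\end{array} \right\} \\
 & \cong & \left\{ \begin{array}{l}
  \{\beta_1 \mapsto tree(nat) \text{ or } list(even), \beta_2 \mapsto nat\}, \\
  \{\beta_1 \mapsto list(nat), \beta_2 \mapsto nat\}
\end{array} \right\}
\end{array}
\]

We also have

\[
\begin{array}{rcl}
\mathcal{K}_1 \curlywedge \mathcal{K}_2 & = & \left\{ \begin{array}{l}
  \{\beta_1 \mapsto (tree(nat) \text{ and } list(even)), \beta_2 \mapsto (nat \text{ and } even)\}, \\
  \{\beta_1 \mapsto (list(nat) \text{ and } list(even)), \beta_2 \mapsto (nat \text{ and } even)\}
\end{array} \right\} \\
 & \cong & \left\{ \begin{array}{l}
  \{\beta_1 \mapsto (tree(nat) \text{ and } list(even)), \beta_2 \mapsto even\}, \\
  \{\beta_1 \mapsto list(even), \beta_2 \mapsto even\}
\end{array} \right\} \\
 & \cong & \{\{\beta_1 \mapsto list(even), \beta_2 \mapsto even\}\}
\end{array}
\]

since $(tree(nat) \text{ and } list(even)) \equiv 0$.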



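A cover for a type sequence and a type scheme sequence can be computed
compositionally according to the following lemma.

Lemma 6.7

Let $\mathcal{K}_1, \mathcal{K}_2 \in \wp(TSub)$, $R \in Type$, $\tau \in Schm$, $\vec{R} \in Type^*$ and $\vec{\tau} \in Schm^*$ such
that $\|\vec{R}\| = \|\vec{\tau}\|$. If $R \sqsubseteq or_{\Bbbk_1 \in \mathcal{K}_1} \Bbbk_1(\tau)$ and $\vec{R} \sqsubseteq or_{\Bbbk_2 \in \mathcal{K}_2} \Bbbk_2(\vec{\tau})$ then
$R \bullet \vec{R} \sqsubseteq or_{\Bbbk \in (\mathcal{K}_1 \curlyvee \mathcal{K}_2)} \Bbbk(\tau \bullet \vec{\tau})$. ∎

6.3.2 Calculating a Cover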

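We now consider how to compute a cover $\mathcal{K}$ for a type $R$ and a type scheme $\tau$.
In the case $R = 1$, $\mathcal{K} = \{\top\}$ is a cover since $\top(\tau) = 1$; and $\mathcal{K} = \{\bot\}$ is a
cover in the case $R = 0$ since $\bot(\tau) = 0$.

Consider the case $R = (R_1 \text{ or } R_2)$. A
cover $\mathcal{K}_j$ can be recursively computed for $R_j$ and $\tau$ for $j = 1, 2$. We have that
$R_j \sqsubseteq or_{\Bbbk \in \mathcal{K}_j} \Bbbk(\tau)$ and hence that $(R_1 \text{ or } R_2) \sqsubseteq or_{\Bbbk \in (\mathcal{K}_1 \cup \mathcal{K}_2)} \Bbbk(\tau)$. So, the union of
$\mathcal{K}_1$ and $\mathcal{K}_2$ is a cover for $R$ and $\tau$.

Consider the case $R = (R_1 \text{ and } R_2)$. A cover
$\mathcal{K}_j$ can be recursively computed for $R_j$ and $\tau$ for $j = 1, 2$. Let
$\mathcal{K} = \mathcal{K}_1 \curlywedge \mathcal{K}_2 = \{\Bbbk_1 \curlywedge \Bbbk_2 \mid \Bbbk_1 \in \mathcal{K}_1 \wedge \Bbbk_2 \in \mathcal{K}_2\}$.
Then $or_{\Bbbk \in \mathcal{K}} \Bbbk(\tau) = or_{\Bbbk_1 \in \mathcal{K}_1 \wedge \Bbbk_2 \in \mathcal{K}_2} (\Bbbk_1 \curlywedge \Bbbk_2)(\tau)$. By
Lemma 6.5(b), $or_{\Bbbk \in \mathcal{K}} \Bbbk(\tau) \equiv or_{\Bbbk_1 \in \mathcal{K}_1 \wedge \Bbbk_2 \in \mathcal{K}_2} (\Bbbk_1(\tau) \text{ and } \Bbbk_2(\tau))$ and hence
$or_{\Bbbk \in \mathcal{K}} \Bbbk(\tau) \equiv (or_{\Bbbk_1 \in \mathcal{K}_1} \Bbbk_1(\tau)) \text{ and } (or_{\Bbbk_2 \in \mathcal{K}_2} \Bbbk_2(\tau))$.
So, $\mathcal{K} = \mathcal{K}_1 \curlywedge \mathcal{K}_2$ is a cover for $R$ and $\tau$.

In the
case $R$ is atomic and $\tau$ is a type parameter, $\mathcal{K} = \{\{\tau \mapsto R\}\}$ is a cover for $R$ and
$\tau$. In the remaining case, $R = c(R_1, \ldots, R_m)$ and $\tau = d(\beta_1, \ldots, \beta_k)$. If $c/m = d/k$
then $\{\{\beta_j \mapsto R_j \mid 1 \le j \le m\}\}$ is a cover. Otherwise $\{\top\}$ is a cover. In summary,
the function that computes a cover is $cover : Type \times Schm \mapsto \wp(TSub)$ defined

\[
\begin{array}{rcl}
cover(1, \tau) & = & \{\top\} \\
cover(0, \tau) & = & \{\bot\} \\
cover((R_1 \text{ or } R_2), \tau) & = & cover(R_1, \tau) \cup cover(R_2, \tau) \\
cover((R_1 \text{ and } R_2), \tau) & = & cover(R_1, \tau) \curlywedge cover(R_2, \tau) \\
cover(R, \beta) & = & \{\{\beta \mapsto R\}\} \\
cover(c(R_1, \ldots, R_m), d(\beta_1, \ldots, \beta_k)) & = & \left\{ \begin{array}{l}
  \text{if } (c/m) = (d/k) \text{ then } \{\{\beta_j \mapsto R_j \mid 1 \le j \le m\}\} \\
  \text{else } \{\top\}
\end{array} \right.
\end{array}
\]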

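Example 6.8

Let $Cons$ be given in Example 4.1. Then,

\[
\begin{array}{rcl}
cover((list(nat) \text{ and } tree(even)), list(\beta)) & = & cover(list(nat), list(\beta)) \curlywedge cover(tree(even), list(\beta)) \\
 & = & \{\{\beta \mapsto nat\}\} \curlywedge \{\top\} \\
 & = & \{\{\beta \mapsto nat\}\}
\end{array}
\]

∎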
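The case analysis for computing a cover translates into a short recursive function. The Python sketch below replays Example 6.8; the tuple encoding of types, the string encoding of type parameters and of the special substitutions, and the `freeze`/`thaw` helpers (used only so that substitutions can be stored in Python sets) are all assumptions made for illustration.

```python
from itertools import product

TOP, BOT = "TOP", "BOT"
ONE, ZERO = ("1",), ("0",)

def meet(k1, k2):
    """Lower bound of two type substitutions (point-wise and)."""
    if k1 == BOT or k2 == BOT:
        return BOT
    if k1 == TOP:
        return k2
    if k2 == TOP:
        return k1
    return {b: ("and", k1[b], k2[b]) for b in k1.keys() & k2.keys()}

def freeze(k):
    return k if isinstance(k, str) else tuple(sorted(k.items()))

def thaw(k):
    return k if isinstance(k, str) else dict(k)

def cover(R, scheme):
    """A set of (frozen) type substitutions covering type R and a type scheme."""
    if R == ONE:
        return {TOP}
    if R == ZERO:
        return {BOT}
    if R[0] == "or":
        return cover(R[1], scheme) | cover(R[2], scheme)
    if R[0] == "and":
        return {freeze(meet(thaw(a), thaw(b)))
                for a, b in product(cover(R[1], scheme), cover(R[2], scheme))}
    if isinstance(scheme, str):          # the scheme is a type parameter
        return {freeze({scheme: R})}
    if (R[0], len(R)) == (scheme[0], len(scheme)):
        return {freeze(dict(zip(scheme[1:], R[1:])))}
    return {TOP}

nat, even, odd = ("nat",), ("even",), ("odd",)
example = cover(("and", ("list", nat), ("tree", even)), ("list", "b"))
split = cover(("or", even, odd), "b")
```

Here `example` is the singleton cover binding `b` to `nat`, and `split` contains one substitution per disjunct, which is what later allows the union to be folded without loss of precision.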

The following lemma states that $cover(R, \tau)$ is a cover for $R$ and $\tau$.

Lemma 6.9

Let $\tau \in Schm$, $R \in Type$ and $\mathcal{K} = cover(R, \tau)$. Then $R \sqsubseteq or_{\Bbbk \in \mathcal{K}} \Bbbk(\tau)$. ∎

6.3.3 Computing a Type

The type of a term $t$ is computed from those of its variables in a bottom-up manner.
The types of the variables are given by a variable typing $\mu$. For a compound term
$t = f(t_1, \ldots, t_n)$, a type $R_i$ is first computed from $t_i$ and $\mu$ for each $1 \le i \le n$. Each
type rule for $f/n$ is applied to compute a type of $t$. Types resulting from all type
rules for $f/n$ are conjoined using $and$. The result is a type of $t$ since the conjunction of
two or more types of $t$ is also a type of $t$. For a type rule $\tau \to f(\tau_1, \ldots, \tau_n)$, a cover $\mathcal{K}_i$
for $R_i$ and $\tau_i$ is computed for each $1 \le i \le n$. Joining the covers for $1 \le i \le n$ yields
a cover $\mathcal{K}$ for $(R_1, \ldots, R_n)$ and $(\tau_1, \ldots, \tau_n)$. The type that is computed from the
type rule is $or_{\Bbbk \in \mathcal{K}} \Bbbk(\tau)$. Define $type : Term(\Sigma, V'_P) \times (V'_P \mapsto Type) \mapsto Type$ by

\[
\begin{array}{rcl}
type(x, \mu) & = & \mu(x) \\
type(f(t_1, \ldots, t_n), \mu) & = & and_{(\tau \to f(\tau_1, \ldots, \tau_n)) \in \Delta}
  \left( or_{\Bbbk \in (\curlyvee_{1 \le i \le n} cover(type(t_i, \mu), \tau_i))} \Bbbk(\tau) \right)
\end{array}
\]
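The bottom-up computation of a type can be sketched in Python as follows. The encoding (nested tuples, string parameters and variables, `nil`/`cons` for list symbols) is assumed for illustration. Two further choices are assumptions of this sketch rather than statements of the paper: `mk_or` and `mk_and` simplify away a redundant 0 or 1 so that results appear in simplified form, and the join over zero covers (a constant such as `[ ]`) is taken to be the empty substitution, which yields `list(0)` for `[ ]`.

```python
from functools import reduce
from itertools import product

TOP, BOT = "TOP", "BOT"
ONE, ZERO = ("1",), ("0",)
nat = ("nat",)
RULES = [  # (result scheme, function symbol, argument schemes)
    (("nat",), "0", ()), (("nat",), "s", (nat,)),
    (("list", "b"), "nil", ()),
    (("list", "b"), "cons", ("b", ("list", "b"))),
]

def mk_or(r1, r2):
    if r1 == ZERO:
        return r2
    if r2 == ZERO or r1 == r2:
        return r1
    return ("or", r1, r2)

def mk_and(r1, r2):
    if r1 == ONE:
        return r2
    if r2 == ONE or r1 == r2:
        return r1
    return ("and", r1, r2)

def apply_subst(k, scheme):
    if k == TOP:
        return ONE
    if k == BOT:
        return ZERO
    if isinstance(scheme, str):
        return k.get(scheme, ZERO)
    return (scheme[0], *(apply_subst(k, a) for a in scheme[1:]))

def join(k1, k2):
    if k1 == TOP or k2 == TOP:
        return TOP
    if k1 == BOT:
        return k2
    if k2 == BOT:
        return k1
    return {b: mk_or(k1.get(b, ZERO), k2.get(b, ZERO)) for b in k1.keys() | k2.keys()}

def meet(k1, k2):
    if k1 == BOT or k2 == BOT:
        return BOT
    if k1 == TOP:
        return k2
    if k2 == TOP:
        return k1
    return {b: mk_and(k1[b], k2[b]) for b in k1.keys() & k2.keys()}

def cover(R, scheme):
    if R == ONE:
        return [TOP]
    if R == ZERO:
        return [BOT]
    if R[0] == "or":
        return cover(R[1], scheme) + cover(R[2], scheme)
    if R[0] == "and":
        return [meet(a, b) for a, b in product(cover(R[1], scheme), cover(R[2], scheme))]
    if isinstance(scheme, str):
        return [{scheme: R}]
    if (R[0], len(R)) == (scheme[0], len(scheme)):
        return [dict(zip(scheme[1:], R[1:]))]
    return [TOP]

def type_of(t, mu):
    """Type of term t under variable typing mu (a dict); variables are strings."""
    if isinstance(t, str):
        return mu[t]
    arg_types = [type_of(ti, mu) for ti in t[1:]]
    result = ONE
    for tau, f, schemes in RULES:
        if (f, len(schemes)) != (t[0], len(t) - 1):
            continue
        covers = [cover(R, s) for R, s in zip(arg_types, schemes)]
        # One joined substitution per choice of a member from each argument's cover.
        K = [reduce(join, choice) if choice else {} for choice in product(*covers)]
        rule_type = reduce(mk_or, (apply_subst(k, tau) for k in K), ZERO)
        result = mk_and(result, rule_type)
    return result

singleton = ("cons", "x", ("nil",))                       # the term [x|[]]
t1 = type_of(singleton, {"x": nat})
t2 = type_of(("cons", "x", "y"),
             {"x": ("or", ("even",), ("odd",)), "y": ("list", ZERO)})
```

Under these assumptions `t1` encodes `list(nat)` and `t2` encodes `(list(even) or list(odd))`, the more precise type discussed at the start of this section.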

Example 6.10

Let $\mu = \{x \mapsto nat, y \mapsto (list(nat) \text{ or } nat)\}$, $\Bbbk_1 = \{\beta \mapsto nat\}$ and $\Bbbk_2 = \{\beta \mapsto 0\}$. By
the definition of $cover$, $cover(nat, \beta) = \{\Bbbk_1\}$ and $cover(list(0), list(\beta)) = \{\Bbbk_2\}$. By
the definition of $type$, $type(x, \mu) = nat$ and $type([\,], \mu) = list(0)$. So,
$type([x|[\,]], \mu) = (\Bbbk_1 \curlyvee \Bbbk_2)(list(\beta)) = list(nat)$. ∎

The following lemma says that $type(t, \mu)$ is a type that contains all the instances
of $t$ under the substitutions described by $\mu$.

Lemma 6.11

Let $t \in Term(\Sigma, V'_P)$ and $\mu \in (V'_P \mapsto Type)$. Then $\theta(t) \in [type(t, \mu)]_\Delta$ for all
$\theta \in \gamma_{VT}(\mu)$. ∎

6.3.4 Upward Propagation

We are now ready to present the overall upward propagation. For a set $S$ of variable
typings and a set $E$ of equations in solved form, upward propagation strengthens
each variable typing $\mu$ in $S$ as follows. For each equation $x = t$ in $E$, $type(t, \mu)$ is
a type of $x$ if variables occurring in $t$ satisfy $\mu$. The overall upward propagation is
performed by a function $up : \wp(Eqn) \times \wp(V'_P \mapsto Type) \mapsto \wp(V'_P \mapsto Type)$ defined

\[
up(E, S) = \left\{ \lambda x \in V'_P. \left( \begin{array}{l}
  \text{if } \exists t.(x = t) \in E \\
  \text{then } (\mu(x) \text{ and } type(t, \mu)) \\
  \text{else } \mu(x)
\end{array} \right) \;\middle|\; \mu \in S \right\}
\]

Example 6.12

Continue with Example 6.10. We have

\[
\begin{array}{rcl}
up(\{y = [x|[\,]]\}, \{\mu\}) & = & \{\mu[y \mapsto ((list(nat) \text{ or } nat) \text{ and } list(nat))]\} \\
 & \approx & \{\{x \mapsto nat, y \mapsto list(nat)\}\}
\end{array}
\]

The correctness of upward propagation is ensured by the following lemma.

Lemma 6.13

Let $S \in \wp(V'_P \mapsto Type)$ and $E \in \wp(Eqn)$. Then $mgu(\theta(E)) \circ \theta \in \gamma([up(E, S)]_\approx)$ for
all $\theta \in \gamma([S]_\approx)$.

6.4 Abstract Unification

Algorithm 6.14 defines the abstract unification operation $Uf^\flat$. Given two atoms
$a_1, a_2 \in Atom_P$ and two abstract substitutions $[S_1]_\approx, [S_2]_\approx \in ASub^\flat$, it first applies
the renaming substitution $\Psi$ to $a_1$ and $S_1$ and computes $E_0 = eq \circ mgu(\Psi(a_1), a_2)$.
If $E_0 = fail$, it returns $[\emptyset]_\approx$, the smallest abstract substitution, which describes
the empty set of substitutions. Otherwise, it calculates $S_0 = \Psi(S_1) \uplus S_2$ where
$\uplus : \wp(\Psi(V_P) \mapsto Type) \times \wp(V_P \mapsto Type) \mapsto \wp(V'_P \mapsto Type)$. A variable typing represents
a conjunctive type constraint. If $\mu$ and $\nu$ have disjoint domains then $\mu \cup \nu$ represents
the conjunction of $\mu$ and $\nu$. The first operand of $\uplus$ is a set of variable typings over
$\Psi(V_P)$ and the second operand a set of variable typings over $V_P$. The result of
$\uplus$ describes the set of all the substitutions that satisfy both of its two operands.
Thus $S_0$ describes the set of all the substitutions that satisfy both $\Psi(S_1)$ and $S_2$.
Note that $S_0 \in \wp(V'_P \mapsto Type)$. The abstract unification operation then calls a
function $solve : Eqn \times \wp(V'_P \mapsto Type) \mapsto \wp(V'_P \mapsto Type)$ to perform downward
and upward propagations. The result $S'_0$ is $solve(E_0, S_0)$ which describes the set
of all the substitutions that satisfy both $E_0$ and $S_0$. Finally, it calls a function
$rest : \wp(V'_P \mapsto Type) \mapsto \wp(V_P \mapsto Type)$ to restrict each variable typing in $S'_0$ to $V_P$.

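Algorithm 6.14

\[
\begin{array}{rcl}
Uf^\flat(a_1, [S_1]_\approx, a_2, [S_2]_\approx) & = & \left\{ \begin{array}{l}
  \text{let } E_0 = eq \circ mgu(\Psi(a_1), a_2) \text{ in} \\
  \text{if } E_0 \neq fail \text{ then } [rest \circ solve(E_0, \Psi(S_1) \uplus S_2)]_\approx \\
  \text{else } [\emptyset]_\approx
\end{array} \right. \\
S_1 \uplus S_2 & = & \{\mu \cup \nu \mid \mu \in S_1 \wedge \nu \in S_2\} \\
rest(S) & = & \{\mu \uparrow V_P \mid \mu \in S \wedge \forall x \in V'_P.(\mu(x) \not\equiv 0)\} \\
solve(E, S) & = & up(E, down(E, S))
\end{array}
\]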

The function $rest$ removes those variable typings that denote the empty set of
substitutions and projects the remaining variable typings onto $V_P$.
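The two auxiliary operations of the algorithm, the merge of renamed and original typings and the final restriction, can be sketched in Python as below. Typings are assumed to be dictionaries and sets of typings lists. The emptiness test is passed in as a parameter: the default used here is a purely syntactic comparison with 0 and is only a placeholder, since deciding emptiness of a type in general is a separate and harder problem.

```python
from itertools import product

ONE, ZERO = ("1",), ("0",)

def uplus(s1, s2):
    """Merge every pair of typings; the two operands are assumed to have disjoint domains."""
    return [{**mu, **nu} for mu, nu in product(s1, s2)]

def rest(S, program_vars, is_empty=lambda R: R == ZERO):
    """Drop typings that bind some variable to an empty type, then project onto program_vars."""
    return [{x: mu.get(x, ONE) for x in program_vars}
            for mu in S
            if not any(is_empty(R) for R in mu.values())]

nat = ("nat",)
renamed = [{"y": ("or", ("list", nat), nat)}]   # a typing over the renamed variable y
original = [{"x": ONE}]                         # a typing over the program variable x
merged = uplus(renamed, original)
projected = rest([{"x": nat, "y": ("list", nat)}], ["x"])
dropped = rest([{"x": ZERO, "y": nat}], ["x"])
```

In this sketch `projected` keeps only the binding for `x`, and `dropped` is empty because the only typing binds `x` to 0.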

Example 6.15

Let $V_P = \{x\}$, $\Psi(x) = y$, $a_1 = p(x)$, $a_2 = p([x|[\,]])$, $S_1 = \{\{x \mapsto (list(nat) \text{ or } nat)\}\}$,
and $S_2 = \{\{x \mapsto 1\}\}$. Then $E_0 = \{y = [x|[\,]]\}$ and $\Psi(S_1) \uplus S_2 = S$ with $S$ being
that in Example 6.2. By Examples 6.2 and 6.12,

\[
\begin{array}{rcl}
solve(E_0, S) & = & up(E_0, down(E_0, S)) = up(E_0, \{\mu\}) \\
 & = & \{\{x \mapsto nat, y \mapsto list(nat)\}\}
\end{array}
\]

with $\mu$ given in Example 6.12. ∎

The following theorem states that $Uf^\flat$ safely abstracts $Uf$ with respect to $\gamma$.

Theorem 6.16

For any $[S_1]_\approx, [S_2]_\approx \in ASub^\flat$ and any $a_1, a_2 \in Atom_P$,

\[ Uf(a_1, \gamma([S_1]_\approx), a_2, \gamma([S_2]_\approx)) \subseteq \gamma(Uf^\flat(a_1, [S_1]_\approx, a_2, [S_2]_\approx)) \]



6.5 Abstract Built-in Execution Operation 

For each built-in, it is necessary to specify an operation that transforms an input
abstract substitution to an output abstract substitution. These operations are given
in Table 1 where abstract substitutions are displayed as sets of variable typings. The
primitive types $integer$, $float$, $number$, $string$, $atom$ and $atomic$ have their usual
denotations in Prolog. Observe that $number \equiv (integer \text{ or } float)$ and
$atomic \equiv (number \text{ or } atom)$.

Unification $t_1 = t_2$ is modeled by $\lambda S.solve(mgu(t_1, t_2), S)$. Let $\theta$ be the program
state before the execution of $t_1 = t_2$ and assume that $\theta$ satisfies $S$. The program state
after the execution of $t_1 = t_2$ is $mgu(\theta(t_1), \theta(t_2)) \circ \theta$ and satisfies $solve(mgu(t_1, t_2), S)$.
Built-ins such as </2 succeed only if their arguments satisfy certain type constraints.
Such type constraints are conjoined with the input abstract substitution to obtain
the output abstract substitution. For instance, the execution of $t_1 < t_2$ in an input
program state $\theta$ succeeds only if $\theta$ instantiates both $t_1$ and $t_2$ to numbers. So, the
abstract operation for $t_1 < t_2$ is $f_3 = \lambda S.(S \otimes vts(number, t_1) \otimes vts(number, t_2))$ where
$vts$ defined in Section 6.2 is extended to deal with built-in types. The extension is
straightforward and omitted. For another instance, $format(t_1)$ succeeds only if $t_1$
is an atom, a list of character codes or a string in its input program state.
The above type constraint is obtained as $vts(atom \text{ or } list(integer) \text{ or } string, t_1)$.
The type $list(integer)$ describes lists of character codes since character codes are
integers. The type checking built-ins such as atom/1 are modeled in the same way.
Built-ins such as @</2 do not instantiate their arguments or check types of their
arguments. They are modeled by the identity function $\lambda S.S$. The built-in fail/0
never succeeds and hence is modeled by the constant function that always returns
$\emptyset$.

Consider a built-in to which a call $p(t_1, \ldots, t_n)$ will definitely instantiate $t_i$ to a
term of type $R_i$ upon success. The type $R_i$ can be propagated downwards $t_i$,
resulting in a set of variable typings. The input abstract substitution can be
strengthened by this set of variable typings to give the output abstract substitution. For
instance, consider $name(t_1, t_2)$. Upon success, $t_1$ is either an atom or an integer
and $t_2$ is a string. So, $name(t_1, t_2)$ is modeled by
$\lambda S.(S \otimes vts(atom \text{ or } integer, t_1) \otimes vts(string, t_2))$.
The built-ins $length(t_1, t_2)$ and $compare(t_1, t_2, t_3)$ fall into this
category.

Consider the built-in $var(t)$. The execution of $var(t)$ succeeds in a program state
$\theta$ iff $\theta(t)$ is a variable. All types that contain variables are equivalent to 1. Thus, the
built-in $var(t)$ is modeled by $\lambda S.\{\mu \mid \mu \in S \wedge type(t, \mu) \equiv 1\}$. The output abstract
substitution contains only those variable typings in which $t$ has no type smaller
than 1. The built-in $nonvar(t)$ is modeled by the identity function $\lambda S.S$ since a
term being a non-variable does not provide any information about its type unless
non-freeness is defined as a type. The same holds for the built-in $ground(t)$ since a term being
ground says nothing about its type unless groundness is defined as a type. The
operation for the built-in $compound(t)$ makes use of the property that a compound
term is not atomic. It removes from the input abstract substitution any variable
typing in which $t$ is atomic.

Table 1. Abstract operations for built-ins where
$f_1 = \lambda S.(S \otimes vts(list(1), t_1) \otimes vts(integer, t_2))$,
$f_2 = \lambda S.(S \otimes vts(atom \text{ or } integer, t_1) \otimes vts(string, t_2))$,
$f_3 = \lambda S.(S \otimes vts(number, t_1) \otimes vts(number, t_2))$, and
$f_4 = \lambda S.(S \otimes vts(atom \text{ or } list(integer) \text{ or } string, t_1))$.

Predicate                                             Operation
----------------------------------------------------  ---------------------------------------------------------
abort, fail, false                                    $\lambda S.\emptyset$
!, t1@<t2, t1@>t2, t1@=<t2, t1@>=t2,                  $\lambda S.S$
  t1\==t2, t1\=t2, display(t1), ground(t1),
  listing, listing(t1), nl, nonvar(t1),
  portray_clause(t1), print(t1), read(t1),
  repeat, true, write(t1), writeq(t1)
compound(t)                                           $\lambda S.\{\mu \mid \mu \in S \wedge type(t, \mu) \not\sqsubseteq atomic\}$
atom(t)                                               $\lambda S.(S \otimes vts(atom, t))$
atomic(t)                                             $\lambda S.(S \otimes vts(atomic, t))$
float(t)                                              $\lambda S.(S \otimes vts(float, t))$
erase(t), integer(t), tab(t)                          $\lambda S.(S \otimes vts(integer, t))$
number(t)                                             $\lambda S.(S \otimes vts(number, t))$
put(t)                                                $\lambda S.(S \otimes vts(atom \text{ or } integer, t))$
string(t)                                             $\lambda S.(S \otimes vts(string, t))$
var(t)                                                $\lambda S.\{\mu \mid \mu \in S \wedge type(t, \mu) \equiv 1\}$
t1=t2, t1==t2                                         $\lambda S.solve(mgu(t_1, t_2), S)$
format(t1), format(t1,t2), format(t0,t1,t2)           $f_4$
t1<t2, t1>t2, t1=<t2, t1>=t2, t1=:=t2,                $f_3$
  t1=\=t2, is(t1,t2)
length(t1,t2)                                         $f_1$
compare(t1,t2,t3)                                     $\lambda S.(S \otimes vts(atom, t_1))$
name(t1,t2)                                           $f_2$

7 Implementation

We have implemented a prototype of our type analysis in SWI-Prolog. The
prototype is a meta-interpreter using ground representations for program variables. The
prototype supports the primitive types $integer$, $float$, $number$, $string$, $atom$ and
$atomic$ with their usual denotations in Prolog.

7.1 Examples

Example 7.1

The following is the intersect program that computes the intersection of two lists
and its analysis result. Lists are defined in Example 4.1. Abstract substitutions are
displayed as comments. The abstract substitution associated with the entry point
of the query is an analysis input whilst all other abstract substitutions are analysis
outputs. Sets are displayed as lists. A binding $V \mapsto T$ is written as V/T, the constructor $or$ as or
and $and$ as and. The code for the predicate member/2 is omitted.

    % [[X/list(atom or float),Y/list(atom or integer)]]
intersect(X,Y,Z).
    % [[X/list(atom or float),Y/list(atom or integer),Z/list(atom)]]

intersect([],L,[]).
    % [[L/list(atom or integer)]]

intersect([X|Xs],Ys,[X|Zs]) :-
    % [[X/atom,Xs/list(atom or float),Ys/list(atom or integer)],
    %  [X/float,Xs/list(atom or float),Ys/list(atom or integer)]]
    member(X,Ys),
    % [[X/atom,Xs/list(atom or float),Ys/list(atom or integer)]]
    intersect(Xs,Ys,Zs).
    % [[X/atom,Xs/list(atom or float),Ys/list(atom or integer),
    %   Zs/list(atom)]]
intersect([X|Xs],Ys,Zs) :-
    % [[X/atom,Xs/list(atom or float),Ys/list(atom or integer)],
    %  [X/float,Xs/list(atom or float),Ys/list(atom or integer)]]
    \+ member(X,Ys),
    % [[X/float,Xs/list(atom or float),Ys/list(atom or integer)],
    %  [X/atom,Xs/list(atom or float),Ys/list(atom or integer)]]
    intersect(Xs,Ys,Zs).
    % [[X/float,Xs/list(atom or float),Ys/list(atom or integer),
    %   Zs/list(atom)],
    %  [X/atom,Xs/list(atom or float),Ys/list(atom or integer),
    %   Zs/list(atom)]]

The result shows that the intersection of a list containing atoms and float numbers 
and another list containing atoms and integer numbers is a list of atoms. This is 
precise because the type ((atom or float) and (atom or integer)) is equivalent to the 
type atom. Without the set operators and and or in their type languages, previous 
type analyses with a priori type definitions cannot produce a result as precise as 
the above. ∎

Example 7.2 

The following is a program p/1. The analysis result is displayed with the typing
binding $x \mapsto \mathbf{1}$ omitted for any variable x.

    p([]).    % [[]]

    p([X|Y]) :-
        % [[]]
        integer(X),
        % [[X/integer]]
        p(Y).
        % [[X/integer, Y/list(or(atom, integer))]]

    p([X|Y]) :-
        % [[]]
        atom(X),
        % [[X/atom]]
        p(Y).
        % [[X/atom, Y/list(or(atom, integer))]]

    :-  % [[]]
        p(U).
        % [[U/list(or(atom, integer))]]

The result captures precisely type information in the success set of the program,
that is, U is a list consisting of integers and atoms upon success of p(U).¹ ∎



¹ This example was provided by an anonymous referee of a previous version of this paper.

During analysis of a program, the analyzer repeatedly checks if two sets of variable
typings are equivalent and if a set of variable typings contains redundant elements. 
Both of these decision problems are reduced to checking if a given type denotes the 
empty set of terms. 



7.2 Emptiness of Types 

Type rules in $\Delta$ are production rules for a context-free tree grammar in
restricted form (Gecseg and Steinby 1984). According to (Lu and Cleary 1998), if
$\mathbf{1}$ denotes the set of all ground terms instead of all terms then each
type denotes a regular tree language. We now show how an algorithm in (Lu and
Cleary 1998) can be used for checking the emptiness of types. We first extend the
type language with the complement operator $\sim$ and define
$eType = Term(Cons \cup \{\sim, \mathit{and}, \mathit{or}, \mathbf{1}, \mathbf{0}\}, \emptyset)$.
Observe that $Type \subset eType$ and that $[\cdot]_\Delta$ is not defined for
elements in $eType \setminus Type$. Since the algorithm in (Lu and Cleary 1998)
was developed for checking the emptiness of types that denote sets of ground
terms, we need to justify its application by closing the gap between the two
different semantics of types. This is achieved by extending the signature
$\Sigma$ with an extra constant $g$ ($g \notin \Sigma$) that is used to encode
variables in terms. Use of extended signatures in analysis of logic programs can
be traced to (Gallagher et al. 1995) where extra constants are used to encode
non-ground terms. In fact, by introducing an infinite set of extra constants one
can obtain an isomorphism between the set of all terms in the original signature
and the set of the ground terms in the extended signature.

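Definition 7.3

The meaning of a type in eType is given by a function
$(\!(\cdot)\!)_\Delta : eType \to \wp(Term(\Sigma \cup \{g\}, \emptyset))$.

$$
\begin{array}{rcl}
(\!(\mathbf{1})\!)_\Delta &=& Term(\Sigma \cup \{g\}, \emptyset) \\
(\!(\mathbf{0})\!)_\Delta &=& \emptyset \\
(\!(\sim R)\!)_\Delta &=& (\!(\mathbf{1})\!)_\Delta \setminus (\!(R)\!)_\Delta \\
(\!(\mathit{and}(R_1, R_2))\!)_\Delta &=& (\!(R_1)\!)_\Delta \cap (\!(R_2)\!)_\Delta \\
(\!(\mathit{or}(R_1, R_2))\!)_\Delta &=& (\!(R_1)\!)_\Delta \cup (\!(R_2)\!)_\Delta \\
(\!(c(R_1, \cdots, R_m))\!)_\Delta &=&
  \displaystyle\bigcup_{(c(\beta_1, \cdots, \beta_m) \to f(\tau_1, \cdots, \tau_n)) \in \Delta}
  \{ f(t_1, \cdots, t_n) \mid \forall 1 \le i \le n.\ t_i \in (\!(k(\tau_i))\!)_\Delta \} \\
&& \mbox{where } k = \{ \beta_j \mapsto R_j \mid 1 \le j \le m \}
\end{array}
$$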



There are two differences between $(\!(\cdot)\!)_\Delta$ and $[\cdot]_\Delta$.
Firstly, $\sim$ is interpreted as set complement under $(\!(\cdot)\!)_\Delta$
whilst it has no denotation under $[\cdot]_\Delta$. Type constructor $\sim$ can
be interpreted as set complement by $(\!(\cdot)\!)_\Delta$ because
$(\!(R)\!)_\Delta$ is a regular tree language for any $R \in eType$ (Lu and
Cleary 1998). It cannot be interpreted as set complement by $[\cdot]_\Delta$
because the complement of $[R]_\Delta$ is not closed under instantiation.
Secondly, the universal type $\mathbf{1}$ denotes $Term(\Sigma, Var)$ in
$[\cdot]_\Delta$ whilst it denotes $Term(\Sigma \cup \{g\}, \emptyset)$ in
$(\!(\cdot)\!)_\Delta$. An implication is that a type denotes a set of terms
closed under instantiation under $[\cdot]_\Delta$ whilst it denotes a set of
ground terms under $(\!(\cdot)\!)_\Delta$.

Let $\chi : Term(\Sigma, Var) \to Term(\Sigma \cup \{g\}, \emptyset)$ be defined
by $\chi(x) = g$ for all $x \in Var$ and
$\chi(f(t_1, \cdots, t_n)) = f(\chi(t_1), \cdots, \chi(t_n))$. The function
$\chi(\cdot)$ transforms a term into a ground term by replacing all variables in
the term with the same constant $g$. The following theorem states that, given a
term $t$ and a type $R$, the membership of $t$ in $[R]_\Delta$ is equivalent to
that of $\chi(t)$ in $(\!(R)\!)_\Delta$.

Theorem 7.4 

For any term $t$ in $Term(\Sigma, Var)$ and any type $R$ in $Type$,
$t \in [R]_\Delta$ iff $\chi(t) \in (\!(R)\!)_\Delta$. ∎
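The encoding of variables by the extra constant can be illustrated with a small
Python sketch. It is not part of the analyzer; the term representation (the
classes Var and Fn) and the name G for the extra constant are assumptions made
only for this illustration.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Var:
    """A logic variable (hypothetical representation)."""
    name: str


@dataclass(frozen=True)
class Fn:
    """A function symbol applied to arguments; constants have no arguments."""
    functor: str
    args: Tuple["Term", ...] = ()


Term = Union[Var, Fn]

# The extra constant g, assumed not to occur in the original signature.
G = Fn("g")


def chi(t: Term) -> Fn:
    """Replace every variable in t with the same constant g."""
    if isinstance(t, Var):
        return G
    return Fn(t.functor, tuple(chi(a) for a in t.args))


def is_ground(t: Term) -> bool:
    """True iff t contains no variables."""
    if isinstance(t, Var):
        return False
    return all(is_ground(a) for a in t.args)
```

For example, chi maps the term cons(X, nil) to the ground term cons(g, nil).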

As a consequence, checking the emptiness of a type under $[\cdot]_\Delta$ can be
reduced to checking the emptiness of the type under $(\!(\cdot)\!)_\Delta$, and
vice versa. Therefore, whether $[R]_\Delta = \emptyset$ can be decided by
employing the algorithm developed in (Lu and Cleary 1998) that checks if
$(\!(R)\!)_\Delta = \emptyset$. The following corollary of the theorem allows us
to reduce a type inclusion test under $[\cdot]_\Delta$ to a type inclusion test
under $(\!(\cdot)\!)_\Delta$.

Corollary 7.5

For any $R_1, R_2 \in Type$, $[R_1]_\Delta \subseteq [R_2]_\Delta$ iff
$(\!(R_1)\!)_\Delta \subseteq (\!(R_2)\!)_\Delta$. ∎

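In order to reduce the decision problems to the emptiness of types, we need to
extend the syntax for type sequence expressions with the operator $\sim$ and to
extend $(\!(\cdot)\!)_\Delta$ to type sequences. The expression $\sim E$ is a
type sequence expression whenever $E$ is a type sequence expression. Let $R$ be
a type in $eType$, $\mathbf{R}$ a type sequence in $eType^*$, and $E_1$ and $E_2$
be type sequence expressions. Define
$(\!(\epsilon)\!)_\Delta = \{\epsilon\}$,
$(\!(R \cdot \mathbf{R})\!)_\Delta = (\!(R)\!)_\Delta \cdot (\!(\mathbf{R})\!)_\Delta$,
$(\!(E_1\ \mathit{and}\ E_2)\!)_\Delta = (\!(E_1)\!)_\Delta \cap (\!(E_2)\!)_\Delta$,
$(\!(E_1\ \mathit{or}\ E_2)\!)_\Delta = (\!(E_1)\!)_\Delta \cup (\!(E_2)\!)_\Delta$ and
$(\!(\sim E)\!)_\Delta = (\!(\mathbf{1})\!)_\Delta - (\!(E)\!)_\Delta$.
It can be shown that both Theorem 7.4 and Corollary 7.5 carry over to type
sequence expressions that do not contain $\sim$.

Set equality under $(\!(\cdot)\!)_\Delta$ induces an equivalence on types and on
type sequence expressions. Let $R_1 \equiv R_2$ iff
$(\!(R_1)\!)_\Delta = (\!(R_2)\!)_\Delta$ and $E_1 \equiv E_2$ iff
$(\!(E_1)\!)_\Delta = (\!(E_2)\!)_\Delta$. The following function eliminates the
complement operator $\sim$ over type sequence expressions.

$$
\begin{array}{rcl}
push(\sim(\mathit{or}_{i \in I} E_i)) &=& \mathit{and}_{i \in I}\ push(\sim E_i) \\
push(\sim(\mathit{and}_{i \in I} E_i)) &=& \mathit{or}_{i \in I}\ push(\sim E_i) \\
push(\sim\langle R_1, R_2, \cdots, R_k \rangle) &=&
  \mathit{or}_{1 \le l \le k}\ \langle \underbrace{\mathbf{1}, \cdots, \mathbf{1}}_{l-1},
  \sim R_l, \underbrace{\mathbf{1}, \cdots, \mathbf{1}}_{k-l} \rangle
  \quad \mbox{for } k \ge 1
\end{array}
$$

It follows from De Morgan's law and the definition of $(\!(\cdot)\!)_\Delta$
that $push(\sim E) \equiv \sim E$. Note that the complement operator $\sim$ does
not apply to any type sequence expression in $push(\sim E)$: it only applies to
type expressions. Let $R \in eType$ and define
$etype(R) = ((\!(R)\!)_\Delta = \emptyset)$. The formula $etype(R)$ is true iff
$R \equiv \mathbf{0}$ is true. By Theorem 7.4, if $R \in Type$ then $etype(R)$
is true iff $[R]_\Delta = \emptyset$ is true.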
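The three defining equations of push can be transcribed almost literally. The
following Python sketch is only an illustration under an assumed encoding that
is not taken from the paper: a type sequence is ("seq", [R1, ..., Rk]), the
operators are ("or", [...]), ("and", [...]) and ("not", E), types are opaque
strings, the universal type is the string "1", and a complemented type is
written ("tnot", R).

```python
def push(expr):
    """Eliminate ~ applied to a type sequence expression.

    expr must have the form ("not", E). In the result, complement is
    applied to types only (tagged "tnot"), never to sequence expressions.
    """
    tag, body = expr
    if tag != "not":
        raise ValueError("push expects a complemented expression")
    kind, parts = body
    if kind == "or":
        # ~(E1 or ... or En) = push(~E1) and ... and push(~En)
        return ("and", [push(("not", p)) for p in parts])
    if kind == "and":
        # ~(E1 and ... and En) = push(~E1) or ... or push(~En)
        return ("or", [push(("not", p)) for p in parts])
    if kind == "seq":
        # ~<R1,...,Rk> = or over l of <1,...,1,~Rl,1,...,1>
        k = len(parts)
        return ("or", [
            ("seq", ["1"] * l + [("tnot", parts[l])] + ["1"] * (k - l - 1))
            for l in range(k)
        ])
    raise ValueError("unsupported expression: %r" % (kind,))
```

Applied to the complement of the sequence of list(even) or list(odd) and
list(nat), the sketch returns a disjunction of two sequences, one complementing
each component in turn.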



7.3 Equivalence between Sets of Variable Typings 

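An indispensable operation in a static analyzer is to check if a fixpoint has been
reached. This operation reduces to checking if two sets of variable typings denote
the same set of concrete substitutions. This equivalence test is reduced to
checking emptiness of types as follows. Let $V_P = \{x_1, \cdots, x_k\}$ and
$S_1, S_2 \in \wp(V_P \mapsto Type)$. By definition, $S_1 \approx S_2$ iff
$\bigcup_{\mu \in S_1} \gamma_{VT}(\mu) \subseteq \bigcup_{\nu \in S_2} \gamma_{VT}(\nu)$
and
$\bigcup_{\nu \in S_2} \gamma_{VT}(\nu) \subseteq \bigcup_{\mu \in S_1} \gamma_{VT}(\mu)$.
Suppose $S_1 = \{\mu_1, \mu_2, \cdots, \mu_m\}$ and
$S_2 = \{\nu_1, \nu_2, \cdots, \nu_n\}$. We construct $R_1, R_2, \cdots, R_m$ and
$T_1, T_2, \cdots, T_n$ as follows:
$R_i = \langle \mu_i(x_1), \mu_i(x_2), \cdots, \mu_i(x_k) \rangle$ and
$T_j = \langle \nu_j(x_1), \nu_j(x_2), \cdots, \nu_j(x_k) \rangle$. Then
$\bigcup_{\mu \in S_1} \gamma_{VT}(\mu) \subseteq \bigcup_{\nu \in S_2} \gamma_{VT}(\nu)$
is true iff
$\mathit{or}_{1 \le i \le m} R_i \sqsubseteq \mathit{or}_{1 \le j \le n} T_j$ is
true. By Corollary 7.5,
$\mathit{or}_{1 \le i \le m} R_i \sqsubseteq \mathit{or}_{1 \le j \le n} T_j$ is
true iff
$(\mathit{or}_{1 \le i \le m} R_i)\ \mathit{and} \sim(\mathit{or}_{1 \le j \le n} T_j) \equiv \mathbf{0}$
is true. The latter can be reduced to emptiness of types as shown in (Lu and
Cleary 1998).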

Example 7.6 

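Let $\Delta$ be given as in Example 4.1, $V_P = \{x, y\}$,
$S_1 = \{\mu_1, \mu_2\}$ and $S_2 = \{\mu_3\}$ where

    $\mu_1 = \{x \mapsto list(even), y \mapsto list(nat)\}$
    $\mu_2 = \{x \mapsto list(odd), y \mapsto list(nat)\}$
    $\mu_3 = \{x \mapsto list(even)\ or\ list(odd), y \mapsto list(nat)\}$

The truth value of
$(\gamma_{VT}(\mu_1) \cup \gamma_{VT}(\mu_2)) \subseteq \gamma_{VT}(\mu_3)$ is
decided by testing emptiness of types as follows. Let

    $R_1 = \langle list(even), list(nat) \rangle$
    $R_2 = \langle list(odd), list(nat) \rangle$
    $T_1 = \langle list(even)\ or\ list(odd), list(nat) \rangle$

Then $(\gamma_{VT}(\mu_1) \cup \gamma_{VT}(\mu_2)) \subseteq \gamma_{VT}(\mu_3)$
iff $(R_1\ or\ R_2)\ and \sim T_1 \equiv \mathbf{0}$ which, by replacing
$\sim T_1$ with $push(\sim T_1)$ and distributing and over or, is equivalent to
the conjunction of the following formulas.

    $\langle list(even), list(nat) \rangle\ and\ \langle \mathbf{1}, \sim list(nat) \rangle \equiv \mathbf{0}$
    $\langle list(odd), list(nat) \rangle\ and\ \langle \mathbf{1}, \sim list(nat) \rangle \equiv \mathbf{0}$
    $\langle list(even), list(nat) \rangle\ and\ \langle \sim list(even)\ and \sim list(odd), \mathbf{1} \rangle \equiv \mathbf{0}$
    $\langle list(odd), list(nat) \rangle\ and\ \langle \sim list(even)\ and \sim list(odd), \mathbf{1} \rangle \equiv \mathbf{0}$

The first of the above holds iff either $list(even) \equiv \mathbf{0}$ or
$list(nat)\ and \sim list(nat) \equiv \mathbf{0}$, both of which are emptiness
tests on types. Since $list(nat)\ and \sim list(nat) \equiv \mathbf{0}$, the
first formula is decided to be true. The other three can be decided to be true
similarly. Therefore,
$(\gamma_{VT}(\mu_1) \cup \gamma_{VT}(\mu_2)) \subseteq \gamma_{VT}(\mu_3)$
holds. In a similar way, we can show that
$\gamma_{VT}(\mu_3) \subseteq (\gamma_{VT}(\mu_1) \cup \gamma_{VT}(\mu_2))$
holds. So, $S_1 \approx S_2$. ∎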
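The reduction used in this example can be tried out on a toy model. The Python
sketch below is an illustration only: it replaces regular tree languages by
finite sets, so that emptiness of a type becomes emptiness of a Python set, and
it approximates list(even), list(odd) and list(nat) by lists of length at most
two over {0}, {1} and {0, 1}. These finite stand-ins and the function names are
assumptions of the sketch, not the algorithm of (Lu and Cleary 1998).

```python
from itertools import product


def lists_over(elems, max_len=2):
    """Finite stand-in for list(T): all tuples over elems up to max_len."""
    return {tuple(p)
            for n in range(max_len + 1)
            for p in product(sorted(elems), repeat=n)}


def seq_included(rs, ts):
    """True iff (or of sequences rs) is included in (or of sequences ts).

    Each sequence is a tuple of sets of equal length k. After pushing the
    complement inwards and distributing 'and' over 'or', every conjunct
    selects one position l per T_j and intersects component l with the
    complement of T_j[l]. A conjunct is empty iff some component is empty.
    """
    k = len(rs[0])
    for r in rs:
        for choice in product(range(k), repeat=len(ts)):
            comps = [set(c) for c in r]
            for t, l in zip(ts, choice):
                comps[l] -= t[l]          # intersect with the complement
            if all(comps):                # every component is non-empty
                return False              # a witness tuple exists
    return True


LE, LO, LN = lists_over({0}), lists_over({1}), lists_over({0, 1})
R1, R2, T1 = (LE, LN), (LO, LN), (LE | LO, LN)
equivalent = seq_included([R1, R2], [T1]) and seq_included([T1], [R1, R2])
```

In this toy model the two sets of sequences are found to be equivalent, in
agreement with the conclusion of the example.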

7.4 Redundancy Removal 

For the sake of an efficient implementation, an abstract substitution
$[S]_\approx$ should be represented by a set of variable typings that does not
contain redundancy. A set of variable typings can be redundant in two ways.
Firstly, a variable typing $\mu$ in $S$ may denote the empty set of
substitutions, i.e., $\mu(x) \equiv \mathbf{0}$ for some $x \in V_P$. Secondly, a
variable typing $\mu$ in $S$ can be subsumed by other variable typings in that
$\gamma_{VT}(\mu) \subseteq \bigcup_{\nu \in S \wedge \nu \neq \mu} \gamma_{VT}(\nu)$.
In both cases, $S \setminus \{\mu\}$ and $S$ denote the same set of substitutions
and $\mu$ can be removed from $S$. Suppose $V_P = \{x_1, \cdots, x_k\}$. The
detection of $\gamma_{VT}(\mu) = \emptyset$ reduces to
$etype(\mu(x_1)) \vee \cdots \vee etype(\mu(x_k))$ while the detection of
$\gamma_{VT}(\mu) \subseteq \bigcup_{\nu \in S \wedge \nu \neq \mu} \gamma_{VT}(\nu)$
can be reduced to checking emptiness of types as in Section 7.3.

Example 7.7 

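Let $\Delta$ be given as in Example 4.1, $V_P = \{x, y\}$ and
$S = \{\mu_1, \mu_2, \mu_3\}$ where

    $\mu_1 = \{x \mapsto list(even), y \mapsto list(nat)\}$
    $\mu_2 = \{x \mapsto list(odd), y \mapsto list(nat)\}$
    $\mu_3 = \{x \mapsto list(nat), y \mapsto list(nat)\}$

We now show how $\mu_1$ is decided to be redundant in $S$. Let

    $R_1 = \langle list(even), list(nat) \rangle$
    $R_2 = \langle list(odd), list(nat) \rangle$
    $R_3 = \langle list(nat), list(nat) \rangle$

Then $\gamma_{VT}(\mu_1) \subseteq \gamma_{VT}(\mu_2) \cup \gamma_{VT}(\mu_3)$
holds iff $R_1\ and \sim(R_2\ or\ R_3) \equiv \mathbf{0}$ holds iff
$R_1\ and \sim R_2\ and \sim R_3 \equiv \mathbf{0}$ holds. The latter, after
replacing $\sim R_2$ and $\sim R_3$ with $push(\sim R_2)$ and $push(\sim R_3)$
respectively and distributing and over or, is equivalent to the conjunction of
the following formulas.

    $\langle list(even), list(nat) \rangle\ and\ \langle \mathbf{1}, \sim list(nat) \rangle\ and\ \langle \mathbf{1}, \sim list(nat) \rangle \equiv \mathbf{0}$
    $\langle list(even), list(nat) \rangle\ and\ \langle \mathbf{1}, \sim list(nat) \rangle\ and\ \langle \sim list(nat), \mathbf{1} \rangle \equiv \mathbf{0}$
    $\langle list(even), list(nat) \rangle\ and\ \langle \sim list(odd), \mathbf{1} \rangle\ and\ \langle \mathbf{1}, \sim list(nat) \rangle \equiv \mathbf{0}$
    $\langle list(even), list(nat) \rangle\ and\ \langle \sim list(odd), \mathbf{1} \rangle\ and\ \langle \sim list(nat), \mathbf{1} \rangle \equiv \mathbf{0}$

Each of the above can be decided to be true by testing emptiness of types as in
Example 7.6. Therefore, $\mu_1$ is redundant in $S$. In a similar way, $\mu_2$ is
decided to be redundant in $S \setminus \{\mu_1\}$. So, $S \approx \{\mu_3\}$.
That $\mu_3$ is not redundant in $S$ is decided similarly. ∎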
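Redundancy removal can be sketched in the same spirit. The Python fragment below
is an illustration under the assumption that each variable typing is given as a
tuple of finite sets, one per variable, standing in for the denotations of its
types; the helper names are hypothetical. A typing is dropped if one of its
components is empty or if it is covered by the remaining typings.

```python
from itertools import product


def lists_over(elems, max_len=2):
    """Finite stand-in for list(T): all tuples over elems up to max_len."""
    return {tuple(p)
            for n in range(max_len + 1)
            for p in product(sorted(elems), repeat=n)}


def covered(r, ts):
    """True iff the sequence r is included in the union of the sequences ts."""
    for choice in product(range(len(r)), repeat=len(ts)):
        comps = [set(c) for c in r]
        for t, l in zip(ts, choice):
            comps[l] -= t[l]              # intersect with the complement
        if all(comps):                    # non-empty conjunct: not covered
            return False
    return True


def remove_redundant(typings):
    """Drop typings that denote nothing or are subsumed by the others."""
    kept = list(typings)
    i = 0
    while i < len(kept):
        mu = kept[i]
        rest = kept[:i] + kept[i + 1:]
        if any(len(c) == 0 for c in mu) or covered(mu, rest):
            kept = rest                   # mu is redundant; re-examine index i
        else:
            i += 1
    return kept


LE, LO, LN = lists_over({0}), lists_over({1}), lists_over({0, 1})
mu1, mu2, mu3 = (LE, LN), (LO, LN), (LN, LN)
reduced = remove_redundant([mu1, mu2, mu3])
```

On the typings of this example the sketch keeps only the stand-in for the third
typing.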

7.5 Tabling 

The operations in our type analysis are complex because of non-deterministic type 
definitions and non-discriminative union at both the level of types and the level 
of abstract substitutions. The equality of two abstract substitutions in an analysis 
without these features can be done in linear time (Horiuchi and Kanamori 1988; 
Kanamori and Horiuchi 1985; Kanamori and Kawamura 1993; Lu 1995). The same 
operation is exponential in our type analysis because deciding the emptiness of a 
type is exponential. This indicates that our type analysis could be much more time 
consuming. 

Table 2. Time Performance

Program    Program Points  Goal                        Time (ms)
---------  --------------  --------------------------  ----------
browse                103  q                                  110
cs_r                  277  pgenconfig(C)                      661
disj_r                132  top(K)                             171
dnf                    77  go                                 200
kalah                 228  play(G, R)                         590
life                  100  life(MR, MC, LC, SFG)               89
meta                   89  interpret(G)                        50
neural                341  go                                 250
nbody                 375  go(M, G)                           281
press                 318  test_press(X, Y)                   161
serialize              37  go(S)                               80
zebra                  43  zebra(E, S, J, U, N, Z, W)          40
---------  --------------  --------------------------  ----------
               Sum = 2120                              Sum = 2683

As shown later, there is a high degree of repetition in emptiness checks during
the analysis of a program. Making use of this observation, we have reduced the
increase in analysis time to 15% on average using a simple tabling technique. We
memoize each call to etype(R) and its success or failure by asserting a fact
$etype_tabled(R, Ans). The fact $etype_tabled(R, yes) (resp.
$etype_tabled(R, no)) indicates that etype(R) has been called before and
etype(R) succeeded (resp. failed). The tabled version of etype(R) is
etype_tabled(R). It first checks if a fact $etype_tabled(R, Ans) exists. If so,
the call etype_tabled(R) succeeds or fails immediately. Otherwise, it calls
etype(R) and memoizes its success or failure.
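The tabling scheme amounts to memoizing a boolean function. The following Python
sketch mirrors the description above with a dictionary in place of asserted
facts; the class name and the counters are hypothetical and are included only to
make the effect of the table observable. It assumes that types are given in a
hashable form.

```python
class TabledEmptiness:
    """Memoizing wrapper around an emptiness test etype(R)."""

    def __init__(self, etype):
        self._etype = etype   # the underlying (expensive) emptiness test
        self._table = {}      # plays the role of the asserted facts
        self.calls = 0        # number of calls to the tabled version
        self.misses = 0       # number of calls passed on to etype

    def __call__(self, r):
        self.calls += 1
        if r not in self._table:
            # First call for this type: run the test and record the answer,
            # whether it succeeded (True) or failed (False).
            self.misses += 1
            self._table[r] = self._etype(r)
        return self._table[r]
```

Both positive and negative answers are stored, so a repeated check never reaches
the underlying test again.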

We now present some experimental results with the prototype analyzer. The 
experiments were done with a Pentium (R) 4 CPU 2.26 GHz running GNU/Linux 
and SWI-Prolog-5.2.13. 

7.5.1 Time Performance 

Table 2 shows analysis time on a suite of benchmark programs. Each row except 
the last one corresponds to a test case. The first three columns contain the name, 
the size of the program in terms of the number of program points and the top-level 
goal. The top abstract substitution which contains no type information is used as 
the input abstract substitution for each test case. These test cases will be used 
in subsequent tables where only the program names are given. The fourth column 
gives analysis time in milliseconds. The time is obtained by running the analyzer ten 
times on the test case and averaging analysis time from these runs. Timing data in 
other tables are also obtained in this way. The table shows that the analyzer takes 
an average of 1.27 milliseconds per program point. 

7.5.2 Repetition of Emptiness Checks 

Table 3 shows that there is a high degree of repetition in emptiness checks during
analysis. Each test case corresponds to a row of the table. The first column of
the row is the name of the program, the second is the total number of emptiness
checks that occur during analysis. The third column gives the number of different
types that are checked for emptiness. The fourth column gives the average
repetition of emptiness checks, which is the ratio of the second and the third
columns. While the total number of emptiness checks can be very large for a test
case, the number of different emptiness checks is small, exhibiting a high degree
of repetition in emptiness checks. The repetition of the emptiness checks ranges
from 36.00 to 698.88. The average of the per-program ratios is about 192.23. This
motivated the use of tabling to reduce the time spent on emptiness checks.

Table 3. Repetition of Emptiness Checks

Program    Total Checks  Different Checks  Degree of Repetition
---------  ------------  ----------------  --------------------
browse             3050                64                 47.65
cs_r              23846                53                449.92
disj_r             4500                37                121.62
dnf                6290                 9                698.88
kalah             31182                86                362.58
life               3277                24                136.54
meta                468                13                 36.00
neural             7985               131                 60.95
nbody              8567                39                219.66
press              1734                23                 75.39
serialize          2019                37                 54.56
zebra               947                22                 43.04
---------  ------------  ----------------  --------------------
                                                Ave. = 192.23
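The figures in Table 3 can be recomputed from its second and third columns. The
short Python calculation below uses only the numbers reported in the table; the
variable names are ours. It indicates that the reported average of about 192.23
corresponds to the plain mean of the per-program ratios, whereas pooling all
checks over all programs gives a smaller ratio of about 174.47.

```python
# (total checks, different checks) per program, as reported in Table 3
checks = {
    "browse": (3050, 64), "cs_r": (23846, 53), "disj_r": (4500, 37),
    "dnf": (6290, 9), "kalah": (31182, 86), "life": (3277, 24),
    "meta": (468, 13), "neural": (7985, 131), "nbody": (8567, 39),
    "press": (1734, 23), "serialize": (2019, 37), "zebra": (947, 22),
}

# Degree of repetition per program: total checks / different checks
ratios = {name: total / different
          for name, (total, different) in checks.items()}

# Plain mean of the per-program ratios
mean_of_ratios = sum(ratios.values()) / len(ratios)

# Pooled ratio: all checks divided by all different checks
pooled_ratio = (sum(t for t, _ in checks.values())
                / sum(d for _, d in checks.values()))
```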

7.5.3 Effect of Tabulation

Table 4 illustrates the effect of tabling. Statistics are obtained by running the
analyzer with and without tabling. For both experiments, we measured analysis
time and time spent on emptiness checks. The table shows that tabling reduces
total analysis time from 6968 ms to 2683 ms, that is, to about 38.5% of the time
taken without tabling. The table also gives the proportion of analysis time that
is spent on emptiness checks. An average of 53% of analysis time is spent on
emptiness checks without tabling while only a negligible portion of analysis time
is spent on emptiness checks with tabling.

7.6 Cost and Effect of Precision Improvement Features

The precision improvement features in our type analysis all incur some performance
penalty. In order to evaluate the effect of these features, we also implemented a
simplified type analysis. The simplified analysis is obtained by removing the
precision improvement features from the full-fledged analysis. In the simplified
analysis, type expressions do not contain the constructors or or and; an abstract
substitution is simply a variable typing; and non-deterministic type definitions
are disallowed. Function overloading is still allowed. Abstract operations are
simplified accordingly. For instance, since (list(list(nat)) or list(nat)) is not
in the type language of the simplified analysis, the least upper bound of
list(list(nat)) and list(nat) is list(1).

Table 4. Effect of Tabulation (times in milliseconds)

                     With Tabling                     Without Tabling
           --------------------------------  --------------------------------
Program    Analysis Time  Check Time  Prop.  Analysis Time  Check Time  Prop.
---------  -------------  ----------  -----  -------------  ----------  -----
browse               110          10   0.09            269         129   0.47
cs_r                 661                              1700         891   0.52
disj_r               171                               359         157   0.43
dnf                  200                               581         363   0.62
kalah                590                              1939        1124   0.57
life                  89                               210         117   0.55
meta                  50                                60          21   0.35
neural               250          30   0.12            731         469   0.64
nbody                281                               620         314   0.50
press                161                               250          29   0.11
serialize             80                               179          82   0.45
zebra                 40                                70          31   0.44
---------  -------------  ----------  -----  -------------  ----------  -----
              Sum = 2683          Ave. = 0.01    Sum = 6968         Ave. = 0.53



Table 5. Cost and Effect of Precision Improvement Features (times in milliseconds)

Program    Simplified     Full-fledged   Time Ratio  Precision Ratio
           Analysis Time  Analysis Time
---------  -------------  -------------  ----------  ---------------
browse            100.00         110.00        0.90             0.74
cs_r              589.00         661.00        0.89             0.77
disj_r            151.00         171.00        0.88             0.73
dnf               190.00         200.00        0.95             0.00
kalah             420.00         590.00        0.71             0.49
life               89.00          89.00        1.00             0.35
meta               40.00          50.00        0.80             0.05
neural            200.00         250.00        0.80             0.49
nbody             231.00         281.00        0.82             0.23
press             159.00         161.00        0.98             0.95
serialize          80.00          80.00        1.00             0.83
zebra              40.00          40.00        1.00             0.35
---------  -------------  -------------  ----------  ---------------
                                         Ave. = 0.85      Ave. = 0.54



The least upper bound operation on abstract substitutions is the point-wise exten- 
sion of the least upper bound operation on types. 

Table 5 compares the two type analyses. The two analyses are performed on each
test case with the same input type information. The input abstract substitution
for the full-fledged analysis is a singleton set of a variable typing. The
corresponding abstract substitution for the simplified analysis is the variable
typing. For each test case, the table gives analysis times by the two analyzers
and their ratio. The relative performance of the two analyzers varies with the
test case. On average, the simplified analysis takes 85 percent of the analysis
time of the full-fledged type analysis. This illustrates that the precision
improvement features do not substantially increase analysis time.

The fifth column in Table 5 gives information about the effect of the precision
improvement features. For each program, it lists the ratio of the number of the
program points at which the full-fledged analysis derives more precise type
information than the simplified analysis over the number of all program points.
Whether or not these features improve analysis precision depends on the program
that is analysed. For some programs like dnf and meta, there is little or no
improvement. For some other programs like press and serialize, there is a
substantial improvement. On average, the full-fledged analysis derives more
precise type information at 54% of the program points in a program. This
indicates that the precision improvement features are cost-effective.



7.7 Termination

The abstract domain of our type analysis contains chains of infinite length, which
may lead to non-termination of the analysis of a program.

Example 7.8

Let the program consist of a single clause p(x) :- ① p([x]) where ① is a label of
a program point. Let the query be of the form :- p(u) with u being of type nat.
Then x is a term of type $list^i(nat)$ at the $i$-th time the execution reaches
the program point ①. Thus, the chain of the abstract substitutions at the program
point ① is

    $\{\{x \mapsto list(nat)\}\}$
    $\{\{x \mapsto list(nat)\}, \{x \mapsto list(list(nat))\}\}$
    $\vdots$
    $\{\{x \mapsto list^j(nat)\} \mid 1 \le j \le k\}$
    $\vdots$

which is infinite. The program is an instance of polymorphic recursion (Kahrs 1996)
which is prohibited in ML. ∎

The analyzer uses a canonical representation of types and a depth abstraction
to ensure termination. A conjunctive type is compact if it contains no duplicated
type atoms. A type in disjunctive normal form is compact if it contains no
duplicated conjuncts and all of its conjuncts are compact. A type is canonical if
it is in disjunctive normal form, it is compact and all arguments of its type
atoms are canonical. For every type $R$, a canonical equivalent of $R$, that is,
a canonical type $R^c$ such that $R^c \equiv R$, can be obtained as follows. A
disjunctive normal form $R'$ of $R$ is first computed. Each argument of each type
atom in $R'$ is then replaced with its canonical equivalent, resulting in a type
$R''$. Finally, $R^c$ is obtained by deleting duplicate type atoms in each
conjunct of $R''$ and then deleting duplicate conjuncts. Let $cn(R)$ denote the
canonical equivalent of $R$ obtained by the above procedure. For instance,
$cn(tree(tree(list(\mathbf{1})\ or\ list(\mathbf{1})))) = tree(tree(list(\mathbf{1})))$.
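One compact way to experiment with canonical forms is to let sets do the
deletion of duplicates. The Python sketch below is an illustration under an
assumed encoding that is not the analyzer's: a type is a tuple whose first
element is either "or", "and" or a type constructor name, and the canonical form
is returned as a set of conjuncts, each conjunct being a set of type atoms with
canonical arguments. It canonicalizes arguments while building the disjunctive
normal form rather than in a separate pass.

```python
def cn(t):
    """Canonical form of a type as a frozenset of conjuncts.

    Each conjunct is a frozenset of type atoms (constructor, arg1, ...),
    where every argument is itself in canonical form. Using sets removes
    duplicated type atoms and duplicated conjuncts automatically.
    """
    kind = t[0]
    if kind == "or":
        # The disjuncts of both operands, with duplicates merged.
        return cn(t[1]) | cn(t[2])
    if kind == "and":
        # Distribute 'and' over 'or': pair every conjunct of the left
        # operand with every conjunct of the right operand.
        return frozenset(c1 | c2 for c1 in cn(t[1]) for c2 in cn(t[2]))
    # A type atom: canonicalize its arguments.
    atom = (kind,) + tuple(cn(a) for a in t[1:])
    return frozenset([frozenset([atom])])


ONE = ("1",)
lhs = cn(("tree", ("tree", ("or", ("list", ONE), ("list", ONE)))))
rhs = cn(("tree", ("tree", ("list", ONE))))
```

With this encoding the two sides of the instance given above receive the same
canonical form.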

Let $R$ be a type. An atomic sub-term $A$ of $R$ is both a sub-term of $R$ and a
type atom. The depth of $A$ in $R$ is the number of the occurrences of type
constructors in Cons on the path from the root of $R$ to but excluding the root
of $A$. Thus, the depth of the only occurrence of list(nat) in
tree(tree(list(even) or list(list(nat)))) is 3 and the depth of the only
occurrence of list(even) in the same type is 2. Note that type constructors and
and or are ignored in determining the depth of $A$ in $R$. If the depth of $A$ in
$R$ is $k$ then $A$ is called an atomic sub-term of $R$ at depth $k$. The depth
of $R$ is defined as the maximum of the depths of all its atomic sub-terms.

Definition 7.9

Let $R$ be a type and $k$ a positive integer. The depth $k$ abstraction of $R$,
denoted as $d_k(R)$, is the result of replacing each argument of any atomic
sub-term of $R$ at depth $k$ by $\mathbf{1}$. ∎

For instance,

    $d_2(tree(tree(list(even)\ or\ list(list(nat)))))$
        $= tree(tree(list(\mathbf{1})\ or\ list(\mathbf{1})))$
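The depth of a type and its depth k abstraction can be computed by a single
traversal. The following Python sketch assumes the same hypothetical tuple
encoding of types, with "or" and "and" as the set operators and ("1",) as the
universal type; it further assumes that the constants 1 and 0 are not counted as
type atoms when measuring depth.

```python
SET_OPS = {"or", "and"}


def depth_abstraction(t, k, depth=0):
    """Replace each argument of every type atom at depth k by 1."""
    kind = t[0]
    if kind in SET_OPS:
        # 'or' and 'and' do not contribute to the depth.
        return (kind,) + tuple(depth_abstraction(a, k, depth) for a in t[1:])
    if depth == k:
        return (kind,) + tuple(("1",) for _ in t[1:])
    return (kind,) + tuple(depth_abstraction(a, k, depth + 1) for a in t[1:])


def type_depth(t, depth=0):
    """Maximum depth of the atomic sub-terms of t."""
    kind = t[0]
    if kind in SET_OPS:
        return max(type_depth(a, depth) for a in t[1:])
    if kind in ("1", "0"):
        return 0  # assumption: the constants 1 and 0 are not type atoms
    return max([depth] + [type_depth(a, depth + 1) for a in t[1:]])


nat, even = ("nat",), ("even",)
r = ("tree", ("tree", ("or", ("list", even), ("list", ("list", nat)))))
abstracted = depth_abstraction(r, 2)
```

Here abstracted encodes tree(tree(list(1) or list(1))), and type_depth applied to
the encoding of list(nat) yields 1.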

During analysis, the abstract substitution for a program point is initialized to
the empty set of variable typings. It is updated by adding new variable typings
and removing redundant ones. The analyzer ensures termination as follows. For
each program point, the analyzer determines a depth $k$ the first time a
non-empty set of variable typings $S_0$ is added. The depth $k$ is the maximum of
the depths of the types occurring in $S_0$ plus some fixed constant $k_0$ with
$k_0 > 0$. After that, each time a set of variable typings $S$ is added, each
type $R$ occurring in $S$ is replaced by $cn(d_k(cn(R)))$. The above abstraction
preserves analysis correctness because $R' \sqsubseteq d_k(R')$ and
$cn(R) \equiv R$. The number of depth $k$ abstractions of the canonical types
occurring in the abstract substitution is bounded and so is the number of
variable typings in the abstract substitution. This ensures termination.

Example 7.10 

Continue Example 7.8 and let $k_0 = 1$. We have
$S_0 = \{\{x \mapsto list(nat)\}\}$ and hence $k = 2$ since the depth of the only
type list(nat) in $S_0$ is 1. The chain of the abstract substitutions for the
program point ① is

    $\{\{x \mapsto list(nat)\}\}$
    $\{\{x \mapsto list(nat)\}, \{x \mapsto list(list(nat))\}\}$
    $\{\{x \mapsto list(nat)\}, \{x \mapsto list(list(nat))\}, \{x \mapsto list(list(list(\mathbf{1})))\}\}$

The last in the chain is the final abstract substitution for the program point ①. ∎

8 Related Work 

There is a rich literature on type inference analysis for logic programs. Type
analyses in (Frühwirth et al. 1991; Gallagher and de Waal 1994; Gallagher and
Puebla 2002; Mishra 1984; Zobel 1987) are performed without a priori type
definitions. They generate regular tree grammars, or type graphs (Van Hentenryck
et al. 1995; Janssens and Bruynooghe 1992) or set constraints (Heintze and Jaffar
1990; Heintze and Jaffar 1992) as type definitions. These different formalisms
for expressing type definitions are equivalent. A type graph is equivalent to a
regular tree grammar such that a production rule in the grammar corresponds to a
subgraph that is composed of a node and its successors in the graph. For a system
of set constraints, there is a regular tree grammar that generates the least
solution to the system of set constraints, and vice versa (Cousot and Cousot
1995). The production rules in a regular tree grammar are similar to type rules
used in our analysis but are not parameterized. This kind of analysis is useful
for compile-time optimizations and transformations but inferred type definitions
can be difficult for the programmer to interpret. Like those in (Horiuchi and
Kanamori 1988; Kanamori and Horiuchi 1985; Barbuti and Giacobazzi 1992; Kanamori
and Kawamura 1993; Codish and Demoen 1994; Lu 1995; Saglam and Gallagher 1995;
Codish and Lagoon 2000; Lu 1998; Hill and Spoto 2002), our type analysis is
performed with a priori type definitions. The type expressions it infers are
formed of given type constructors. Since the meaning of a type constructor is
given by a priori type definitions that are well understood by the programmer,
the inferred types are easier for the programmer to interpret and thus they are
more useful in an interactive programming environment.

The type analyses with a priori type definitions in (Horiuchi and Kanamori 1988; 
Kanamori and Horiuchi 1985; Kanamori and Kawamura 1993; Lu 1995) are based 
on top-down abstract interpretation frameworks. They are performed with a type 
description of possible queries as an input and are thus goal-dependent. They infer 
for each program point a type description of all the program states that might be 
obtained when the execution of the program reaches that program point. These are 
also characteristics of our analysis. However, these analyses do not support non- 
deterministic type definitions or non-discriminative union at the levels of types and 
abstract substitutions. The analysis in (Lu 1998) traces non-discriminative union 
at the level of abstract substitutions but not at the level of types. In addition, it 
does not allow non-determinism in type definitions. The above mentioned top-down 
type analyses with a priori type definitions approximate non-discriminative union 
of two types by their least upper bound. The least upper bound may have a strictly 
larger denotation than the set union of the denotations of the two types since set 
union is not a type constructor. Thus, our type analysis is strictly more precise 
than (Horiuchi and Kanamori 1988; Kanamori and Horiuchi 1985; Kanamori and 
Kawamura 1993; Lu 1995; Lu 1998). 

The type analyses with a priori type definitions in (Barbuti and Giacobazzi
1992; Codish and Demoen 1994; Saglam and Gallagher 1995; Codish and Lagoon
2000; Hill and Spoto 2002) are based on bottom-up abstract interpretation
frameworks. They infer a type description of the success set of the program. The
inferred type description is a set of type atoms each of which is a predicate
symbol applied to a tuple of types. Some general remarks can be made about the
differences between our analysis and these analyses. Firstly, our analysis is
goal-dependent while these analyses are goal-independent. Secondly, our analysis
allows non-deterministic type definitions that are disallowed by these analyses.
Consequently, more natural typings are allowed by our type analysis than by these
analyses. However, non-deterministic type definitions also make abstract
operations in our analysis more complex than in these analyses. Thirdly, like our
analysis, these analyses can express non-discriminative union at the level of
predicates. For example, the two type atoms p(list(integer)) and p(tree(integer))
express the same information as
$(p(x), x \in (list(integer)\ or\ tree(integer)))$ in our type analysis. However,
these analyses, except for an informal proposal in (Barbuti and Giacobazzi 1992),
cannot trace non-discriminative union at the level of arguments, which leads to
imprecise analysis results. For instance, the inferred type for the concrete atom
p([1,[1]]) is p(list(1)) according to (Codish and Demoen 1994; Saglam and
Gallagher 1995; Codish and Lagoon 2000; Hill and Spoto 2002) and the main
proposal in (Barbuti and Giacobazzi 1992). The inferred type p(list(1)) is less
precise than $(p(x), x \in list(integer\ or\ list(integer)))$ which is inferred
by our type analysis. Lastly, as a minor note, set intersection is not used as a
type constructor in these type analyses except (Hill and Spoto 2002). The two
type clauses $x(list(\beta)) \leftarrow$ and $x(tree(\beta)) \leftarrow$ in an
abstract substitution of (Hill and Spoto 2002) indicate that x is both a list and
a tree. Some comparisons on other aspects between our type analysis and these
bottom-up analyses are in order.

Barbuti and Giacobazzi (1992) infer polymorphic types of Horn clause logic
programs using a bottom-up abstract interpretation framework (Barbuti et al.
1993). The type description of the success set of a Horn logic program is
computed as the least fixed-point of an abstract immediate consequence operator
associated with the program. The abstract immediate consequence operator is
defined in terms of abstract unification and abstract application. Abstract
unification computes an abstract substitution given a term and a type. Abstract
application computes a type given an abstract substitution and a term. Both
computations are derivations of a Prolog program that is derived from a priori
type definitions. The inferred type description describes only part of the
success set of the program though abstract operations can be modified so that the
type description approximates the whole success set. Ill-typed atoms are not
described by the type description. Nor are those well-typed atoms that possess
only ill-typed SLD resolutions. An SLD resolution is ill-typed if any of its
selected atoms is ill-typed. Their type definitions are slightly different from
ours. For instance, they define the type of the empty list [ ] as
$[\,] \to list(\bot)$ which is equivalent to $list(\mathbf{0}) \to [\,]$ in our
notation whilst the empty list [ ] is typed by $list(\beta) \to [\,]$ in our
analysis. Barbuti and Giacobazzi also informally introduced and exemplified an
associative, commutative and idempotent operator $\cup$ that expresses
non-deterministic union at the level of types. However, abstract unification and
abstract application operations for this modified domain of types are not given.
In addition, it requires changing type definitions, for instance, from
$cons(\beta, list(\beta)) \to list(\beta)$ to
$cons(\alpha, list(\beta)) \to list(\alpha \cup \beta)$. Barbuti and Giacobazzi's
analysis captures more type dependency than ours. This is achieved through type
parameters. For instance, the type description for the program
$\{p(X, [X]) \leftarrow\}$ is $\{p(\alpha, list(\alpha))\}$. Abstract unification
of the query p(X, Y) with the only type atom in the type description yields the
abstract substitution $\{X \mapsto \alpha, Y \mapsto list(\alpha)\}$. This kind of
type dependency will be lost in our analysis. The use of type parameters and the
use of non-discriminative union are orthogonal to each other and it is an
interesting topic for future research to combine them for more analysis precision.

Codish and Demoen (1994) apply abstract compilation (Hermenegildo et al. 1992)
to infer type dependencies by associating each type with an incarnation of the
abstract domain Prop (Marriott and Søndergaard 1989). The incarnations of Prop
define meanings of types and capture interactions between types. The type
dependencies of a logic program are similar to the type description of the program inferred
by the type analysis of Barbuti and Giacobazzi (1992) except that the type
dependencies describe the whole success set of the program. Codish and Lagoon (2000)
improve on (Codish and Demoen 1994) by augmenting abstract compilation with
ACI-unification. An associative, commutative and idempotent operator $\oplus$ is introduced
to form the type of a term from the types of its sub-terms. It has the flavor of set
union. Nevertheless, it does not denote set union. For example, the term
[1, [1]] has type list(integer) $\oplus$ list(list(integer)) according to (Codish and Lagoon
2000) while it has type list(integer or list(integer)) in our type analysis. Like
(Barbuti and Giacobazzi 1992), the type analyses in (Codish and Demoen 1994; Codish and
Lagoon 2000) capture type dependency via type parameters. In addition, they have
the desired property of condensing which our analysis does not have.
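
The shape of the type that non-discriminative union assigns to [1, [1]] can be illustrated with a minimal Python sketch. It is not the analysis itself: it only computes a type expression for a nested Python list by joining the distinct element types with "or". The function name `type_of` and the choice of `list(0)` for an empty list are assumptions made for this illustration.

```python
def type_of(value):
    """Return a type expression (as a string) for an int or a nested list."""
    if isinstance(value, int):
        return "integer"
    if isinstance(value, list):
        # Collect the distinct element types and join them with "or".
        elem_types = sorted({type_of(v) for v in value})
        if not elem_types:
            # Assumption: an empty list gets the smallest list type, list(0).
            return "list(0)"
        return "list(" + " or ".join(elem_types) + ")"
    raise TypeError("unsupported value")


print(type_of([1, [1]]))  # list(integer or list(integer))
```

The union is taken at the level of the element type, inside the list constructor, which is what distinguishes it from combining the two list types from the outside.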

Hill and Spoto (2002) provide a method that enriches an abstract domain with
type dependency information. The enriched domain contains elements like
$(x \in nat) \to (y \in list(nat))$ meaning that if x has type nat then y has type list(nat). Each
element in the enriched domain is represented as a logic program. Type analysis is
performed by abstract compilation. Their approach to improving precision of type 
analysis is different from ours. Their domain can express type dependencies that 
ours cannot whilst our domain can express non-discriminative union at the level 
of types but theirs cannot. Hill and Spoto do not take subtyping into account in 
their design of abstract operations possibly because subtyping is outside the focus 
of their work. 

Gallagher and de Waal (1994) approximate the success set of the program by a
unary regular logic program (Yardeni and Shapiro 1991). This analysis infers both 
type definitions and types and is incorporated into the Ciao System (Hermenegildo 
et al. 1999). Saglam and Gallagher (1995) extend (Gallagher and de Waal 1994) by 
allowing the programmer to supply deterministic type definitions for some function 
symbols. The supplied type definitions are used to transform the program and the 
transformed program is analyzed as in (Gallagher and de Waal 1994). An interesting 
topic for further study is to integrate non-deterministic type definitions and non- 
discriminative union into (Saglam and Gallagher 1995) and evaluate their impact 
on analysis precision and analysis cost. 

Finally, it is also worth mentioning work on directional types (Bronsard et al.
1992; Aiken and Lakshman 1994; Boye and Maluszynski 1996; Charatonik and
Podelski 1998; Rychlikowski and Truderung 2001). Aiken and Lakshman (1994)
present an algorithm for automatically checking directional types of logic programs.
Directional types describe both the structure of terms and the directionality of
predicates. A directional type for a predicate p/n is of the form $\tau_I \to \tau_O$. Type
$\tau_I$ is called an input type and type $\tau_O$ an output type. They are type tuples of
dimension n. The directional type expresses two requirements. Firstly, if p/n is
called with an argument of type $\tau_I$ then the argument has type $\tau_O$ upon its success.
Secondly, each predicate q/m invoked by p is called with an argument that has the
input type of a directional type for q/m. A program is well-typed with respect to
a collection of directional types if each directional type in the collection is verified.
The type checking problem is reduced to a decision problem on systems of inclusion
constraints over set expressions. The algorithm is sound and complete for
discriminative directional types. Charatonik and Podelski (1998) provide an algorithm for
inferring directional types with respect to which the program is well-typed.



9 Conclusion 

We have presented a type analysis. The type analysis supports non-deterministic 
type definitions, allows set operators in type expressions, and uses a set of variable 
typings to describe type information in a set of substitutions. The analysis is pre- 
sented as an abstract domain and four abstract operations for Nilsson's abstract 
semantics (Nilsson 1988) extended to deal with negation and built-in predicates. 
These operations are defined in detail and their local correctness is proved. Abstract
unification involves propagation of type information down and up
the structure of a term. Given a set of equations in solved form and an abstract
substitution, abstract unification is accomplished in two steps. In the first step, 
more type information for variables occurring on the right-hand side of each equa- 
tion is derived from type information for the variable on the left-hand side. The 
second step derives more type information for the variable on the left-hand side 
of each equation from type information for the variables on the right-hand side. 
The abstract built-in execution operation approximates the execution of built-in 
predicates. Each built-in is modeled as a function of abstract substitutions. 

Detection of the least fixpoint and elimination of redundancy in a set of variable 
typings are both reduced to checking the emptiness of types. Though types denote 
sets of possibly non-ground terms and are not closed under set complement, check- 
ing the emptiness of types can be done by using an algorithm that checks for the 
emptiness of the types that denote sets of ground terms. An experimental study
shows that, because emptiness checks are highly repetitive, tabling keeps the cost of
the precision improvement measures to a small increase in analysis time.
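
The effect of tabling on repeated emptiness checks can be sketched as follows. This is a deliberately simplified model, not the algorithm of the paper: types are plain names defined by hypothetical rules without set operators, a type is non-empty when some alternative has only non-empty argument types, and the table is an ordinary dictionary. All names (`RULES`, `table`, `is_empty`) are assumptions made for the example.

```python
# Hypothetical rules: each type name maps to its alternatives, and each
# alternative lists the types of the constructor's arguments.
RULES = {
    "nat": [("zero", []), ("succ", ["nat"])],
    "natlist": [("nil", []), ("cons", ["nat", "natlist"])],
    "stream": [("scons", ["nat", "stream"])],  # no base case: no ground term
}

table = {}                 # tabled answers: type name -> True if empty
stats = {"fixpoints": 0}   # counts how often the fixpoint is recomputed


def nonempty_types(rules):
    """Least fixpoint: the type names that contain at least one ground term."""
    stats["fixpoints"] += 1
    known = set()
    changed = True
    while changed:
        changed = False
        for name, alternatives in rules.items():
            if name in known:
                continue
            if any(all(arg in known for arg in args) for _, args in alternatives):
                known.add(name)
                changed = True
    return known


def is_empty(name):
    """Answer from the table when possible; compute and record otherwise."""
    if name not in table:
        live = nonempty_types(RULES)
        for n in RULES:
            table[n] = n not in live
    return table[name]


for _ in range(1000):      # a highly repetitive workload
    is_empty("stream")
print(is_empty("nat"), is_empty("stream"), stats["fixpoints"])  # False True 1
```

A thousand and two queries trigger a single fixpoint computation, which is the kind of saving the experimental study attributes to tabling.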

Appendix A Proofs

Let $N$ denote the set of natural numbers. Define $h : \mathit{Term}(\Sigma, \mathit{Var}) \mapsto N$ as follows:
$h(x) = 0$ for all $x \in \mathit{Var}$ and $h(f(t_1, \cdots, t_n)) = 1 + \max\{h(t_i) \mid 1 \le i \le n\}$.
Define $h : \mathit{Type} \mapsto N$ in the same way. Note $h(R) \ge 1$ for any $R \in \mathit{Type}$. Let
$(x_1, y_1) \prec (x_2, y_2) = (x_1 < x_2) \lor ((x_1 = x_2) \land (y_1 < y_2))$.

Lemma 4.4. Let $R \in \mathit{Type}$ and $t \in \mathit{Term}$. If $t \in [R]_\Delta$ then $\sigma(t) \in [R]_\Delta$ for any
$\sigma \in \mathit{Sub}$.

Proof

The proof is done by induction on $(h(t), h(R))$. Let $\sigma \in \mathit{Sub}$ be an arbitrary
substitution.

Basis. We have that $h(t) = 0$ and that $h(R) = 1$. So, $t \in \mathit{Var}$ which implies $R = 1$
since $t \in [R]_\Delta$ and $h(R) = 1$. Thus, $[R]_\Delta = \mathit{Term}$ and $\sigma(t) \in [R]_\Delta$.

Induction. Either $h(t) = 0$ or $h(t) > 0$. Consider the case where $h(t) = 0$
first. Then $t \in \mathit{Var}$. Either (i) $R = 1$; (ii) $R = R_1 \,\mathsf{or}\, R_2$; or (iii) $R = R_1 \,\mathsf{and}\, R_2$. The
case (i) is a special case of the base case. Consider the case (ii). We have either that
$t \in [R_1]_\Delta$ or that $t \in [R_2]_\Delta$. If $t \in [R_j]_\Delta$ then, by the induction hypothesis, $\sigma(t) \in [R_j]_\Delta$
for $j = 1, 2$ since $h(R_j) < h(R)$. So, $\sigma(t) \in [R]_\Delta$ by the definition of $[\cdot]_\Delta$. The case
(iii) is symmetric to the case (ii). Thus, $\sigma(t) \in [R]_\Delta$.

Now consider the case where $h(t) > 0$. Then $t = f(t_1, \cdots, t_n)$. Either (i) $R = 1$;
(ii) $R = R_1 \,\mathsf{or}\, R_2$; (iii) $R = R_1 \,\mathsf{and}\, R_2$; or (iv) $R = c(R_1, \cdots, R_m)$. The proof
that $\sigma(t) \in [R]_\Delta$ in the cases (i), (ii) and (iii) is the same as in the previous
paragraph. Consider the case (iv). Since $t \in [R]_\Delta$, there is a type rule
$c(\beta_1, \cdots, \beta_m) \to f(\tau_1, \cdots, \tau_n)$ such that $t_i \in [\Bbbk(\tau_i)]_\Delta$ where
$\Bbbk = \{\beta_1 \mapsto R_1, \cdots, \beta_m \mapsto R_m\}$. We have that $h(\Bbbk(\tau_i)) < h(R)$ and that $h(t_i) < h(t)$. By the
induction hypothesis, $\sigma(t_i) \in [\Bbbk(\tau_i)]_\Delta$, which together with the definition of $[\cdot]_\Delta$ implies that
$\sigma(t) \in [R]_\Delta$. □
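
The induction measure used throughout this appendix, the height $h$ together with the lexicographic order on pairs, can be written down directly. The sketch below assumes a hypothetical term representation (variables as strings, compound terms as `(functor, [args])` pairs) and takes the maximum over an empty argument list to be 0, so that constants have height 1.

```python
def height(term):
    """h(x) = 0 for a variable; h(f(t1, ..., tn)) = 1 + max of the h(ti)."""
    if isinstance(term, str):          # assumption: variables are strings
        return 0
    _functor, args = term              # compound terms are (functor, [args])
    return 1 + max((height(a) for a in args), default=0)


def precedes(p, q):
    """Lexicographic order on pairs of natural numbers."""
    (x1, y1), (x2, y2) = p, q
    return x1 < x2 or (x1 == x2 and y1 < y2)


t = ("cons", ["X", ("nil", [])])
print(height(t))                       # 2
print(precedes((height("X"), 3), (height(t), 1)))  # True
```

Each inductive step in the proofs moves to a pair that precedes the current one, either by descending into a sub-term or by descending into a sub-type.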

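Lemma 5.2. $\gamma(ASub^\flat)$ is a Moore family.

Proof

Since $\gamma([\{x \mapsto 1 \mid x \in V_P\}]_\approx) = \mathit{Sub}$ and $\mathit{Sub}$ is the supremum on $\wp(\mathit{Sub})$, $\gamma(ASub^\flat)$
contains the supremum on $\wp(\mathit{Sub})$. Let $[\mathcal{S}_1]_\approx, [\mathcal{S}_2]_\approx \in ASub^\flat$. Then
$[\mathcal{S}_1]_\approx \sqcap^\flat [\mathcal{S}_2]_\approx \in ASub^\flat$. Furthermore,

$$
\begin{array}{rcl}
\gamma([\mathcal{S}_1]_\approx \sqcap^\flat [\mathcal{S}_2]_\approx) & = & \gamma([\mathcal{S}_1^{\downarrow} \cap \mathcal{S}_2^{\downarrow}]_\approx) \\
& = & (\bigcup_{\mu \in \mathcal{S}_1^{\downarrow}} \gamma_{VT}(\mu)) \cap (\bigcup_{\nu \in \mathcal{S}_2^{\downarrow}} \gamma_{VT}(\nu)) \\
& = & (\bigcup_{\mu \in \mathcal{S}_1} \gamma_{VT}(\mu)) \cap (\bigcup_{\nu \in \mathcal{S}_2} \gamma_{VT}(\nu)) \\
& = & \gamma([\mathcal{S}_1]_\approx) \cap \gamma([\mathcal{S}_2]_\approx)
\end{array}
$$

Thus, $\gamma(ASub^\flat)$ is closed under $\cap$, the meet on $\wp(\mathit{Sub})$. So, $\gamma(ASub^\flat)$ is a Moore
family. □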

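Lemma 5.3. $\gamma([\mathcal{S}_1 \otimes \mathcal{S}_2]_\approx) = \gamma([\mathcal{S}_1]_\approx \sqcap^\flat [\mathcal{S}_2]_\approx)$.

Proof

We first prove that $\gamma([\mathcal{S}_1 \otimes \mathcal{S}_2]_\approx) \subseteq \gamma([\mathcal{S}_1]_\approx \sqcap^\flat [\mathcal{S}_2]_\approx)$. Let $\theta \in \gamma([\mathcal{S}_1 \otimes \mathcal{S}_2]_\approx)$. Then
there is $\rho$ in $(\mathcal{S}_1 \otimes \mathcal{S}_2)$ such that $\theta \in \gamma_{VT}(\rho)$. This implies that there are $\mu$ in $\mathcal{S}_1$
and $\nu \in \mathcal{S}_2$ such that $\rho = \lambda x \in V_P.(\mu(x) \,\mathsf{and}\, \nu(x))$. We have $\gamma_{VT}(\rho) \subseteq \gamma_{VT}(\mu)$ and
$\gamma_{VT}(\rho) \subseteq \gamma_{VT}(\nu)$, implying $\rho \in \mathcal{S}_1^{\downarrow}$ and $\rho \in \mathcal{S}_2^{\downarrow}$. Therefore, $\rho \in (\mathcal{S}_1^{\downarrow} \cap \mathcal{S}_2^{\downarrow})$. Since
$\theta \in \gamma_{VT}(\rho)$, we have that $\theta \in \gamma([\mathcal{S}_1^{\downarrow} \cap \mathcal{S}_2^{\downarrow}]_\approx)$ and $\theta \in \gamma([\mathcal{S}_1]_\approx \sqcap^\flat [\mathcal{S}_2]_\approx)$.

We now prove that $\gamma([\mathcal{S}_1 \otimes \mathcal{S}_2]_\approx) \supseteq \gamma([\mathcal{S}_1]_\approx \sqcap^\flat [\mathcal{S}_2]_\approx)$. Let $\theta \in \gamma([\mathcal{S}_1]_\approx \sqcap^\flat [\mathcal{S}_2]_\approx)$.
Then $\theta \in \gamma_{VT}(\rho)$ for some $\rho \in (\mathcal{S}_1^{\downarrow} \cap \mathcal{S}_2^{\downarrow})$ by the definition of $\sqcap^\flat$. There are $\mu \in \mathcal{S}_1$ and
$\nu \in \mathcal{S}_2$ such that $\gamma_{VT}(\rho) \subseteq \gamma_{VT}(\mu)$ and $\gamma_{VT}(\rho) \subseteq \gamma_{VT}(\nu)$, implying
$\forall x \in V_P.(\theta(x) \in [\mu(x) \,\mathsf{and}\, \nu(x)]_\Delta)$. Thus, $\theta \in \gamma([\mathcal{S}_1 \otimes \mathcal{S}_2]_\approx)$ by the definition of $\otimes$. □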
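
The statement of Lemma 5.3 can be checked by brute force in a toy model. The sketch below is an assumption-laden simplification: a type is represented directly by the finite set of terms it denotes, so the type operator "and" becomes set intersection, and a substitution is a tuple of terms, one per variable. The names `gamma_vt`, `gamma` and `otimes` are hypothetical stand-ins for the concretization functions and the pairwise conjunction of variable typings.

```python
from itertools import product

VARS = ["x", "y"]  # hypothetical program variables


def gamma_vt(mu):
    """All substitutions (tuples over VARS) that map each v into mu[v]."""
    return set(product(*(sorted(mu[v]) for v in VARS)))


def gamma(typings):
    """Union of the concretizations of the variable typings in the set."""
    out = set()
    for mu in typings:
        out |= gamma_vt(mu)
    return out


def otimes(s1, s2):
    """Pairwise, pointwise conjunction; 'and' is intersection in this model."""
    return [{v: mu[v] & nu[v] for v in VARS} for mu in s1 for nu in s2]


S1 = [{"x": {"a", "b"}, "y": {"a"}}, {"x": {"c"}, "y": {"b", "c"}}]
S2 = [{"x": {"b", "c"}, "y": {"a", "c"}}]

print(gamma(otimes(S1, S2)) == gamma(S1) & gamma(S2))  # True
```

In this model both sides equal {(b, a), (c, c)}; the equality holds in general because intersection distributes over the unions that define the concretization.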



Lemma 6.1. For any $R \in \mathit{Type}$ and $t \in \mathit{Term}(\Sigma, V_P)$,
$\{\theta \mid \theta(t) \in [R]_\Delta\} \subseteq \gamma([vts(R, t)]_\approx)$.

Proof

The proof is done by induction on $(h(t), h(R))$.

Basis. $(h(t), h(R)) = (0, 1)$. Then $t \in V_P$ and

$$vts(R, t) = \{\lambda x \in V_P.(\text{if } x = t \text{ then } R \text{ else } 1)\}$$

The lemma holds since $\{\theta \mid \theta(t) \in [R]_\Delta\} = \gamma([vts(R, t)]_\approx)$.

Induction. Assume that the lemma holds for all $R' \in \mathit{Type}$ and $t' \in \mathit{Term}(\Sigma, V_P)$
such that $(h(t'), h(R')) \prec (h(t), h(R))$. Either (1) $h(R) > 1$ or (2) $h(R) = 1$.

Consider the case (1). Either (i) $R = R_1 \,\mathsf{and}\, R_2$ or (ii) $R = R_1 \,\mathsf{or}\, R_2$ or (iii)
$R = c(R_1, \cdots, R_m)$ for some $m \ge 1$. The cases (i) and (ii) are immediate. Consider
the case (iii). By the definition of $[\cdot]_\Delta$, $\theta(f(t_1, \cdots, t_n)) \in [R]_\Delta$ implies that there
is a type rule $c(\beta_1, \cdots, \beta_m) \to f(\tau_1, \cdots, \tau_n)$ in $\Delta$ such that $\theta(t_i) \in [\Bbbk(\tau_i)]_\Delta$ where
$\Bbbk = \{\beta_j \mapsto R_j \mid 1 \le j \le m\}$. We have $h(t_i) < h(f(t_1, \cdots, t_n))$. By the induction
hypothesis,

$$\theta \in \gamma([vts(\Bbbk(\tau_i), t_i)]_\approx)$$

for all $1 \le i \le n$. By Lemmas 5.2 and 5.3,

$$\theta \in \gamma\Big(\Big[\bigotimes_{1 \le i \le n} vts(\Bbbk(\tau_i), t_i)\Big]_\approx\Big)$$

By the definition of $vts$, we have

$$\theta \in \gamma([vts(c(R_1, \cdots, R_m), f(t_1, \cdots, t_n))]_\approx)$$

Thus, the lemma holds for the case (1).

Now consider the case (2). We have that $t = f(t_1, \cdots, t_n)$. The proof is the same
as that for the case (1).(iii). □
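
The downward propagation that $vts$ performs can be sketched for one fixed, hypothetical set of type rules (integers, and lists with `nil` and `cons`). The definition of $vts$ is not restated in this appendix, so the case analysis below is an assumption reconstructed from the proof above: a variable receives the whole type, the arguments of a constructor are combined with $\otimes$, "or" is taken to yield the union of the alternatives and "and" their $\otimes$. A missing variable in a typing stands for type 1.

```python
def conj(a, b):
    """Conjunction of two types, simplifying against the top type "1"."""
    if a == "1":
        return b
    if b == "1" or a == b:
        return a
    return ("and", a, b)


def otimes(s1, s2):
    """Pairwise, pointwise conjunction of two sets of variable typings."""
    return [
        {v: conj(mu.get(v, "1"), nu.get(v, "1")) for v in set(mu) | set(nu)}
        for mu in s1 for nu in s2
    ]


def vts(R, t):
    """Variable typings under which term t may have type R (sketch only)."""
    if R == "0":
        return []                      # the empty type admits no typing
    if R == "1":
        return [{}]                    # no constraint on any variable
    if isinstance(t, str):             # assumption: variables are strings
        return [{t: R}]
    if R[0] == "or":
        return vts(R[1], t) + vts(R[2], t)
    if R[0] == "and":
        return otimes(vts(R[1], t), vts(R[2], t))
    if R == "int":
        return [{}] if isinstance(t, int) else []
    if R[0] == "list" and not isinstance(t, int):
        functor, args = t
        if functor == "nil" and not args:
            return [{}]
        if functor == "cons" and len(args) == 2:
            # Rule list(b) -> cons(b, list(b)): head gets b, tail gets list(b).
            return otimes(vts(R[1], args[0]), vts(R, args[1]))
    return []


print(vts(("list", "int"), ("cons", ["X", "Ys"])))
```

For the term cons(X, Ys) and the type list(int), the single resulting typing gives X the type int and Ys the type list(int).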



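Lemma 6.3. Let $\mathcal{S}' = down(E, \mathcal{S})$. Then $mgu(\theta(E)) \circ \theta \in \gamma([\mathcal{S}']_\approx)$ for all
$\theta \in \gamma([\mathcal{S}]_\approx)$.

Proof

Let $\mathcal{S}_\mu = \{\mu\} \otimes \bigotimes_{(x = t) \in E} vts(\mu(x), t)$. It suffices to prove that
$mgu(\sigma(E)) \circ \sigma \in \gamma([\mathcal{S}_\mu]_\approx)$ for all $\sigma \in \gamma_{VT}(\mu)$. We have $mgu(\sigma(E)) \circ \sigma \in \gamma_{VT}(\mu)$ as the
denotation of any type in $\mathit{Type}$ is closed under substitution. By Lemma 6.1, we have
$mgu(\sigma(E)) \circ \sigma \in \gamma([vts(\mu(x), t)]_\approx)$ for any $(x = t)$ in $E$. So, $mgu(\sigma(E)) \circ \sigma \in \gamma([\mathcal{S}_\mu]_\approx)$. □

Lemma 6.5. For any $\tau \in \mathit{Schm}$ and any $\Bbbk_1, \Bbbk_2 \in \mathit{TSub}$,

(a) $(\Bbbk_1(\tau) \,\mathsf{or}\, \Bbbk_2(\tau)) \sqsubseteq (\Bbbk_1 \curlyvee \Bbbk_2)(\tau)$; and

(b) $(\Bbbk_1(\tau) \,\mathsf{and}\, \Bbbk_2(\tau)) \equiv (\Bbbk_1 \curlywedge \Bbbk_2)(\tau)$.

Proof

We prove only (a) since the proof for (b) is similar to that for (a). Let
$t \in [\Bbbk_1(\tau) \,\mathsf{or}\, \Bbbk_2(\tau)]_\Delta$. Either (1) $t \in [\Bbbk_1(\tau)]_\Delta$ or (2) $t \in [\Bbbk_2(\tau)]_\Delta$. Without loss of
generality, we assume (1). We prove $t \in [(\Bbbk_1 \curlyvee \Bbbk_2)(\tau)]_\Delta$ by induction on $(h(\tau), h(t))$.

Basis. $h(\tau) = 0$. Then $\tau \in \mathit{Para}$ and (a) holds since $(\Bbbk_1(\tau) \,\mathsf{or}\, \Bbbk_2(\tau)) = (\Bbbk_1 \curlyvee \Bbbk_2)(\tau)$
by the definition of $\curlyvee$.

Induction. $h(\tau) \neq 0$ implies that $\tau = c(\tau_1, \cdots, \tau_m)$. If $t \in \mathit{Var}$ then $\Bbbk_1(\tau) = 1$
and hence $(\Bbbk_1 \curlyvee \Bbbk_2)(\tau) = 1$ and $t \in [(\Bbbk_1 \curlyvee \Bbbk_2)(\tau)]_\Delta$. Otherwise, $t = f(t_1, \cdots, t_n)$.
Since $t \in [\Bbbk_1(\tau)]_\Delta$, there is a type rule $\tau \to f(\tau_1, \cdots, \tau_n)$ such that $t_i \in [\Bbbk_1(\tau_i)]_\Delta$ for
$1 \le i \le n$. We have $h(\tau_i) < h(\tau)$ and $h(t_i) < h(t)$. Thus, $t_i \in [(\Bbbk_1 \curlyvee \Bbbk_2)(\tau_i)]_\Delta$ by
the induction hypothesis and hence $t \in [(\Bbbk_1 \curlyvee \Bbbk_2)(\tau)]_\Delta$ by the definition of $[\cdot]_\Delta$. □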
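
A small instance shows why part (a) of Lemma 6.5 is an inclusion rather than an equivalence. The types int and float and the two bindings are hypothetical, chosen only for illustration; $\curlyvee$ denotes the join of type substitutions used in the lemma, which binds each parameter to the "or" of its two bindings, and list is assumed to have the usual rules for the empty list and cons.

```latex
\begin{align*}
\Bbbk_1 &= \{\beta \mapsto \mathit{int}\}, \qquad
\Bbbk_2 = \{\beta \mapsto \mathit{float}\}, \qquad
\tau = \mathit{list}(\beta) \\
\Bbbk_1(\tau) \;\mathsf{or}\; \Bbbk_2(\tau)
  &= \mathit{list}(\mathit{int}) \;\mathsf{or}\; \mathit{list}(\mathit{float}) \\
(\Bbbk_1 \curlyvee \Bbbk_2)(\tau)
  &= \mathit{list}(\mathit{int} \;\mathsf{or}\; \mathit{float})
\end{align*}
```

A list that mixes an integer and a float belongs to the last type but to neither list(int) nor list(float), so the inclusion is strict in this instance.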

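Lemma 6.7. Let $\mathcal{K}_1, \mathcal{K}_2 \in \wp(\mathit{TSub})$, $R \in \mathit{Type}$, $\tau \in \mathit{Schm}$, $\vec{R} \in \mathit{Type}^*$ and
$\vec{\tau} \in \mathit{Schm}^*$ such that $\|\vec{R}\| = \|\vec{\tau}\|$. If $R \sqsubseteq \mathsf{or}_{\Bbbk \in \mathcal{K}_1} \Bbbk(\tau)$ and
$\vec{R} \sqsubseteq \mathsf{or}_{\Bbbk \in \mathcal{K}_2} \Bbbk(\vec{\tau})$ then
$R \bullet \vec{R} \sqsubseteq \mathsf{or}_{\Bbbk \in (\mathcal{K}_1 \curlyvee \mathcal{K}_2)} \Bbbk(\tau \bullet \vec{\tau})$.

Proof

Let $t \bullet \vec{t} \in [R \bullet \vec{R}]_\Delta$. Then $t \in [R]_\Delta$ and $\vec{t} \in [\vec{R}]_\Delta$. By assumption, there are $\Bbbk_1 \in \mathcal{K}_1$
such that $t \in [\Bbbk_1(\tau)]_\Delta$ and $\Bbbk_2 \in \mathcal{K}_2$ such that $\vec{t} \in [\Bbbk_2(\vec{\tau})]_\Delta$. Let $\Bbbk = \Bbbk_1 \curlyvee \Bbbk_2$. We have
$\Bbbk \in (\mathcal{K}_1 \curlyvee \mathcal{K}_2)$ by the definition of $\curlyvee$, and $t \in [\Bbbk(\tau)]_\Delta$ and $\vec{t} \in [\Bbbk(\vec{\tau})]_\Delta$ by Lemma 6.5.
Thus, $t \bullet \vec{t} \in [\Bbbk(\tau \bullet \vec{\tau})]_\Delta$ by the definition of $[\cdot]_\Delta$. □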

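Lemma 6.9. Let $\tau \in \mathit{Schm}$, $R \in \mathit{Type}$ and $\mathcal{K} = cover(R, \tau)$. Then $R \sqsubseteq \mathsf{or}_{\Bbbk \in \mathcal{K}} \Bbbk(\tau)$.

Proof

The proof is done by induction on the structure of $R$.

Basis. $R$ is atomic. $R = 1$ or $R = 0$ or $R = c(R_1, \cdots, R_m)$ for some $c/m \in \mathit{Cons}$
and $R_1, \cdots, R_m \in \mathit{Type}$. If $R = 1$ or $R = 0$ then the lemma holds by the definitions
of $\top$, $\bot$ and $[\cdot]_\Delta$. Let $R = c(R_1, \cdots, R_m)$. Either (a) $\tau \in \mathit{Para}$ or (b) $\tau = d(\beta_1, \cdots, \beta_k)$
with $\beta_1, \cdots, \beta_k$ being different type parameters in $\mathit{Para}$. In the case (a), we have
$\mathcal{K} = \{\Bbbk\}$ with $\Bbbk = \{\tau \mapsto R\}$. The lemma holds because $\Bbbk(\tau) = R$. Consider the case
(b). If $c/m = d/k$ then $\mathcal{K} = \{\Bbbk\}$ with $\Bbbk = \{\beta_j \mapsto R_j \mid 1 \le j \le m\}$ by the definition
of $cover$ and we have $\Bbbk(\tau) = R$. Otherwise, $\mathcal{K} = \{\Bbbk\}$ with $\Bbbk = \top$ by the definition
of $cover$ and $\Bbbk(\tau) = 1$ by the definition of $\top$. So, the lemma holds in the case (b).

Induction. Either (1) $R = R_1 \,\mathsf{or}\, R_2$ or (2) $R = R_1 \,\mathsf{and}\, R_2$. In the case (1),
let $\mathcal{K}_1 = cover(R_1, \tau)$ and $\mathcal{K}_2 = cover(R_2, \tau)$. We have
$[R_i]_\Delta \subseteq \bigcup_{\Bbbk \in \mathcal{K}_i} [\Bbbk(\tau)]_\Delta$ for
$1 \le i \le 2$ by the induction hypothesis. Therefore,

$$
\begin{array}{rcl}
[R_1 \,\mathsf{or}\, R_2]_\Delta & = & [R_1]_\Delta \cup [R_2]_\Delta \\
& \subseteq & \bigcup_{\Bbbk \in \mathcal{K}_1} [\Bbbk(\tau)]_\Delta \cup \bigcup_{\Bbbk \in \mathcal{K}_2} [\Bbbk(\tau)]_\Delta \\
& = & \bigcup_{\Bbbk \in (\mathcal{K}_1 \cup \mathcal{K}_2)} [\Bbbk(\tau)]_\Delta \\
& = & \bigcup_{\Bbbk \in cover(R, \tau)} [\Bbbk(\tau)]_\Delta
\end{array}
$$

So the lemma holds for the case (1). Consider the case (2). Let $\mathcal{K}_1 = cover(R_1, \tau)$
and $\mathcal{K}_2 = cover(R_2, \tau)$. We have $[R_i]_\Delta \subseteq \bigcup_{\Bbbk \in \mathcal{K}_i} [\Bbbk(\tau)]_\Delta$ for $1 \le i \le 2$ by the
induction hypothesis. So,

$$
\begin{array}{rcll}
[R_1 \,\mathsf{and}\, R_2]_\Delta & = & [R_1]_\Delta \cap [R_2]_\Delta & \\
& \subseteq & \bigcup_{\Bbbk \in \mathcal{K}_1} [\Bbbk(\tau)]_\Delta \cap \bigcup_{\Bbbk \in \mathcal{K}_2} [\Bbbk(\tau)]_\Delta & \\
& = & \bigcup_{\Bbbk \in (\mathcal{K}_1 \curlywedge \mathcal{K}_2)} [\Bbbk(\tau)]_\Delta & \text{by Lemma 6.5.(b)} \\
& = & \bigcup_{\Bbbk \in cover(R, \tau)} [\Bbbk(\tau)]_\Delta &
\end{array}
$$

Thus, the lemma holds for the case (2). □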



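Lemma 6.11. Let $t \in \mathit{Term}(\Sigma, V_P)$ and $\mu \in (V_P \mapsto \mathit{Type})$. Then $\theta(t) \in [type(t, \mu)]_\Delta$
for all $\theta \in \gamma_{VT}(\mu)$.

Proof

The proof is done by induction on $h(t)$.

Basis. $h(t) = 0$. Then $t \in V_P$ and $type(t, \mu) = \mu(t)$. The lemma holds.

Induction. $h(t) > 0$. Let $t = f(t_1, \cdots, t_n)$ and $R_i = type(t_i, \mu)$ for $i \in \{1, \cdots, n\}$
and $\theta \in \gamma_{VT}(\mu)$. By the induction hypothesis, we have $\theta(t_i) \in [R_i]_\Delta$ for all $1 \le i \le n$.
Let $\tau \to f(\tau_1, \cdots, \tau_n)$ be a type rule in $\Delta$ and $\mathcal{K}_i = cover(R_i, \tau_i)$. By Lemma 6.9,
$[R_i]_\Delta \subseteq \bigcup_{\Bbbk \in \mathcal{K}_i} [\Bbbk(\tau_i)]_\Delta$. Thus,
$[(R_1, \cdots, R_n)]_\Delta \subseteq \bigcup_{\Bbbk \in (\curlyvee_{1 \le i \le n} \mathcal{K}_i)} [\Bbbk((\tau_1, \cdots, \tau_n))]_\Delta$
by Lemma 6.7, which implies $\theta(t) \in \bigcup_{\Bbbk \in (\curlyvee_{1 \le i \le n} \mathcal{K}_i)} [\Bbbk(\tau)]_\Delta$ by the definition of $[\cdot]_\Delta$.
This is true of each type rule for $f/n$. Therefore, $\theta(t) \in [type(t, \mu)]_\Delta$. □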
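
The upward computation of a type for a term from the types of its variables can be sketched for list terms only. The code is a simplification under stated assumptions, not the paper's definition of type or cover: it hard-codes the hypothetical rules list(b) -> nil and list(b) -> cons(b, list(b)), follows the cover cases visible in the proof of Lemma 6.9 (a matching constructor binds the parameter, "or" collects both alternatives, anything else falls back to the top substitution and hence to type 1), and assumes the empty list gets list(0).

```python
TOP = object()  # stands for the type substitution mapping every scheme to 1


def union(a, b):
    """'or' of two types with light simplification ("0" empty, "1" top)."""
    if a == b or b == "0":
        return a
    if a == "0":
        return b
    if a == "1" or b == "1":
        return "1"
    return ("or", a, b)


def cover_list(R):
    """Bindings for b under which the scheme list(b) covers type R."""
    if isinstance(R, tuple) and R[0] == "list":
        return [R[1]]
    if isinstance(R, tuple) and R[0] == "or":
        return cover_list(R[1]) + cover_list(R[2])
    return [TOP]


def type_of(t, mu):
    """Type of term t under variable typing mu (variables are strings)."""
    if isinstance(t, str):
        return mu.get(t, "1")
    if isinstance(t, int):
        return "int"
    functor, args = t
    if functor == "nil" and not args:
        return ("list", "0")           # assumption: smallest list type
    if functor == "cons" and len(args) == 2:
        head, tail = type_of(args[0], mu), type_of(args[1], mu)
        result = "0"
        for b in cover_list(tail):
            # Join the binding from the head with the binding from the tail.
            alt = "1" if b is TOP else ("list", union(head, b))
            result = union(result, alt)
        return result
    return "1"


nested = ("cons", [1, ("cons", [("cons", [1, ("nil", [])]), ("nil", [])])])
print(type_of(nested, {}))  # ('list', ('or', 'int', ('list', 'int')))
```

The term encodes [1, [1]], and the result corresponds to list(int or list(int)): the head and tail bindings of the list parameter are joined with "or" inside the constructor.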



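Lemma 6.13. Let $\mathcal{S} \in \wp(V_P \mapsto \mathit{Type})$ and $E \in \wp(\mathit{Eqn})$. Then
$mgu(\theta(E)) \circ \theta \in \gamma([up(E, \mathcal{S})]_\approx)$ for all $\theta \in \gamma([\mathcal{S}]_\approx)$.

Proof

Let

$$
\mu' = \lambda x \in V_P.\left(
\begin{array}{l}
\text{if } \exists t.(x = t) \in E \\
\text{then } \mu(x) \,\mathsf{and}\, type(t, \mu) \\
\text{else } \mu(x)
\end{array}
\right)
$$

It suffices to prove that $mgu(\sigma(E)) \circ \sigma \in \gamma_{VT}(\mu')$ for all $\sigma \in \gamma_{VT}(\mu)$. By Lemma 6.11,
$\sigma(t) \in [type(t, \mu)]_\Delta$. We have $(mgu(\sigma(E)) \circ \sigma)(x) \in [type(t, \mu)]_\Delta$ for all $x$ and $t$ such
that $(x = t) \in E$. Therefore, $(mgu(\sigma(E)) \circ \sigma) \in \gamma_{VT}(\mu')$. □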

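Theorem 6.16. For any $[\mathcal{S}_1]_\approx, [\mathcal{S}_2]_\approx \in ASub^\flat$ and any $a_1, a_2 \in \mathit{Atom}_P$,

$$uf(a_1, \gamma([\mathcal{S}_1]_\approx), a_2, \gamma([\mathcal{S}_2]_\approx)) \subseteq \gamma(uf^\flat(a_1, [\mathcal{S}_1]_\approx, a_2, [\mathcal{S}_2]_\approx))$$

Proof

We first prove a preliminary result on substitution and unification. Let $\eta, \theta \in \mathit{Sub}$
and $E, E_1, E_2 \in \wp(\mathit{Eqn})$ and assume that $\theta = mgu(\eta(E)) \circ \eta \neq fail$. Recall that
$mgu(E_1 \cup E_2) = mgu(E_1 \cup eq(mgu(E_2)))$ and $mgu(\eta(E)) = mgu(eq(\eta) \cup E)$ (Eder
1985). Then

$$
\begin{array}{rcl}
mgu(\theta(E)) \circ \theta & = & mgu(mgu(\eta(E)) \circ \eta(E)) \circ mgu(\eta(E)) \circ \eta \\
& = & mgu(mgu(\eta(E))(\eta(E))) \circ mgu(\eta(E)) \circ \eta \\
& = & mgu(eq(mgu(\eta(E))) \cup eq(\eta) \cup E) \circ mgu(\eta(E)) \circ \eta \\
& = & mgu(eq(mgu(eq(\eta) \cup E)) \cup eq(\eta) \cup E) \circ mgu(\eta(E)) \circ \eta \\
& = & mgu(eq(\eta) \cup E \cup eq(\eta) \cup E) \circ mgu(\eta(E)) \circ \eta \\
& = & mgu(eq(\eta) \cup E) \circ mgu(\eta(E)) \circ \eta \\
& = & mgu(\eta(E)) \circ mgu(\eta(E)) \circ \eta \\
& = & mgu(\eta(E)) \circ \eta \\
& = & \theta
\end{array}
$$

We are now ready to prove the theorem. Let $\theta_1 \in \gamma([\mathcal{S}_1]_\approx)$, $\theta_2 \in \gamma([\mathcal{S}_2]_\approx)$ and
$E_0 = eq \circ mgu(\Psi(a_1), a_2)$. Assume that $uf(a_1, \theta_1, a_2, \theta_2) \neq fail$. It is equivalent
to prove $uf(a_1, \theta_1, a_2, \theta_2) \in \gamma(uf^\flat(a_1, [\mathcal{S}_1]_\approx, a_2, [\mathcal{S}_2]_\approx))$. By the definition of $\gamma$ and
$rest$, if $\zeta \in \gamma([\mathcal{S}]_\approx)$ then $\zeta \in \gamma([rest(\mathcal{S})]_\approx)$ for any substitution $\zeta$ and any set $\mathcal{S}$
of variable typings over $V_P$. Thus, it suffices to prove that
$uf(a_1, \theta_1, a_2, \theta_2) \in \gamma([up(E_0, down(E_0, \Psi(\mathcal{S}_1) \uplus \mathcal{S}_2))]_\approx)$ by the definitions of $uf^\flat$ and $solve$.
Without loss of generality, assume that $\Psi$ renames $\theta_1(a_1)$ apart from $\theta_2(a_2)$. Let
$\eta = \theta_2 \cup \Psi(\theta_1)$ and $\theta = mgu(\eta(E_0)) \circ \eta$. Then

$$
\begin{array}{rl}
& uf(a_1, \theta_1, a_2, \theta_2) \in \gamma([up(E_0, down(E_0, \Psi(\mathcal{S}_1) \uplus \mathcal{S}_2))]_\approx) \\
\Leftrightarrow & mgu((\Psi(\theta_1))(\Psi(a_1)), \theta_2(a_2)) \circ \theta_2 \in \gamma([up(E_0, down(E_0, \Psi(\mathcal{S}_1) \uplus \mathcal{S}_2))]_\approx) \\
\Leftrightarrow & mgu(\eta(E_0)) \circ \eta \in \gamma([up(E_0, down(E_0, \Psi(\mathcal{S}_1) \uplus \mathcal{S}_2))]_\approx) \\
\Leftrightarrow & \theta \in \gamma([up(E_0, down(E_0, \Psi(\mathcal{S}_1) \uplus \mathcal{S}_2))]_\approx)
\end{array}
$$

Thus, it remains to prove $\theta \in \gamma([up(E_0, down(E_0, \Psi(\mathcal{S}_1) \uplus \mathcal{S}_2))]_\approx)$. Since
$\eta \in \gamma(\Psi(\mathcal{S}_1) \uplus \mathcal{S}_2)$ and $\theta = mgu(\eta(E_0)) \circ \eta$, it holds that
$\theta \in \gamma([down(E_0, \Psi(\mathcal{S}_1) \uplus \mathcal{S}_2)]_\approx)$
according to Lemma 6.3. According to Lemma 6.13, we have
$mgu(\theta(E_0)) \circ \theta \in \gamma([up(E_0, down(E_0, \Psi(\mathcal{S}_1) \uplus \mathcal{S}_2))]_\approx)$. Note that $mgu(\theta(E_0)) \circ \theta = \theta$ by the
preliminary result. Thus, $\theta \in \gamma([up(E_0, down(E_0, \Psi(\mathcal{S}_1) \uplus \mathcal{S}_2))]_\approx)$. □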
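
The preliminary identity $mgu(\theta(E)) \circ \theta = \theta$ can be checked by hand on a small instance. The substitution $\eta$ and the equation set $E$ below are hypothetical, $\epsilon$ denotes the identity substitution, and composition is read as $(\sigma \circ \eta)(x) = \sigma(\eta(x))$, which is an assumption about the convention in use.

```latex
\begin{align*}
\eta &= \{x \mapsto f(y)\}, \qquad E = \{y = a\} \\
\eta(E) &= \{y = a\}, \qquad mgu(\eta(E)) = \{y \mapsto a\} \\
\theta &= mgu(\eta(E)) \circ \eta = \{x \mapsto f(a),\; y \mapsto a\} \\
\theta(E) &= \{a = a\}, \qquad mgu(\theta(E)) = \epsilon \\
mgu(\theta(E)) \circ \theta &= \epsilon \circ \theta = \theta
\end{align*}
```

Once $\theta$ already solves $E$, applying it to $E$ leaves only trivial equations, so unifying again adds nothing.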

Theorem 7.4. For any term $t$ in $\mathit{Term}(\Sigma, \mathit{Var})$ and any type $R$ in $\mathit{Type}$, $t \in [R]_\Delta$
iff $\chi(t) \in ((R))_\Delta$.

Proof

We first prove necessity. Assume that $t \in [R]_\Delta$. We prove that $\chi(t) \in ((R))_\Delta$ by
induction on $(h(t), h(R))$.

Basis. $h(t) = 0$ and $h(R) = 1$. We have that $t \in \mathit{Var}$ and that $R = 1$ by the
definition of $[\cdot]_\Delta$. Thus, $\chi(t) = \varrho \in ((R))_\Delta$.

Induction. Either $h(t) = 0$ and $h(R) > 1$ or $h(t) > 0$ and $h(R) \ge 1$. Consider first
the case where $h(t) = 0$ and $h(R) > 1$. Then $t \in \mathit{Var}$ and either (i) $R = (R_1 \,\mathsf{or}\, R_2)$;
or (ii) $R = (R_1 \,\mathsf{and}\, R_2)$. We only prove the case (i) since the case (ii) is dual
to the case (i). Since $t \in [R]_\Delta$, either $t \in [R_1]_\Delta$ or $t \in [R_2]_\Delta$. So, we have either
$\chi(t) \in ((R_1))_\Delta$ or $\chi(t) \in ((R_2))_\Delta$ by the induction hypothesis. Thus, $\chi(t) \in ((R))_\Delta$.

Now consider the case $h(t) > 0$ and $h(R) \ge 1$. Then $t = f(t_1, \cdots, t_n)$. Either (i)
$R = (R_1 \,\mathsf{or}\, R_2)$; (ii) $R = (R_1 \,\mathsf{and}\, R_2)$; (iii) $R = 1$ or (iv) $R = c(R_1, \cdots, R_m)$. The
proofs that $\chi(t) \in ((R))_\Delta$ in cases (i) and (ii) are the same as in the previous
paragraph. The case (iii) is vacuous. Consider the case (iv). Since $t \in [R]_\Delta$, by the
definition of $[\cdot]_\Delta$, there is a type rule $c(\beta_1, \cdots, \beta_m) \to f(\tau_1, \cdots, \tau_n)$ in $\Delta$ such that
$t_i \in [\Bbbk(\tau_i)]_\Delta$ for all $1 \le i \le n$ where $\Bbbk = \{\beta_1 \mapsto R_1, \cdots, \beta_m \mapsto R_m\}$. Observe that
$h(t_i) < h(t)$ and $h(\Bbbk(\tau_i)) < h(R)$. By the induction hypothesis, $\chi(t_i) \in ((\Bbbk(\tau_i)))_\Delta$. By
the definition of $((\cdot))_\Delta$, $\chi(t) \in ((R))_\Delta$.

We now prove sufficiency. Assume that $\chi(t) \in ((R))_\Delta$. We prove that $t \in [R]_\Delta$ by
induction on $(h(t), h(R))$.

Basis. $h(t) = 0$ and $h(R) = 1$. Then $t \in \mathit{Var}$ and $\chi(t) = \varrho$. We have that $R = 1$
by the definition of $((\cdot))_\Delta$. Thus, $t \in [R]_\Delta$.

Induction. Either $h(t) = 0$ and $h(R) > 1$ or $h(t) > 0$ and $h(R) \ge 1$. Consider first
the case where $h(t) = 0$ and $h(R) > 1$. Then $t \in \mathit{Var}$ and either (i) $R = (R_1 \,\mathsf{or}\, R_2)$;
or (ii) $R = (R_1 \,\mathsf{and}\, R_2)$. We only prove the case (i) since the case (ii) is dual to the
case (i). Since $\chi(t) \in ((R))_\Delta$, either $\chi(t) \in ((R_1))_\Delta$ or $\chi(t) \in ((R_2))_\Delta$. So, we have
either $t \in [R_1]_\Delta$ or $t \in [R_2]_\Delta$ by the induction hypothesis. Thus, $t \in [R]_\Delta$.

Now consider the case $h(t) > 0$ and $h(R) \ge 1$. Then $t = f(t_1, \cdots, t_n)$. Either
(i) $R = (R_1 \,\mathsf{or}\, R_2)$; (ii) $R = (R_1 \,\mathsf{and}\, R_2)$; (iii) $R = 1$ or (iv) $R = c(R_1, \cdots, R_m)$.
The proofs that $t \in [R]_\Delta$ in cases (i) and (ii) are the same as in the previous
paragraph. The case (iii) is vacuous. Consider the case (iv). Since $\chi(t) \in ((R))_\Delta$,
by the definition of $((\cdot))_\Delta$, there is a type rule $c(\beta_1, \cdots, \beta_m) \to f(\tau_1, \cdots, \tau_n)$ in $\Delta$
such that $\chi(t_i) \in ((\Bbbk(\tau_i)))_\Delta$ for all $1 \le i \le n$ where
$\Bbbk = \{\beta_1 \mapsto R_1, \cdots, \beta_m \mapsto R_m\}$. Observe that $h(t_i) < h(t)$ and $h(\Bbbk(\tau_i)) < h(R)$. By the induction hypothesis,
$t_i \in [\Bbbk(\tau_i)]_\Delta$. By the definition of $[\cdot]_\Delta$, $t \in [R]_\Delta$. □
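
The role of $\chi$ in this proof can be illustrated with a short sketch. The definition of $\chi$ is not restated in this appendix; from its use in the basis cases, where it sends a variable to a distinguished constant, it is assumed here to replace every variable of a term by that constant, so that the result is ground. The constant name `$rho` and the term representation (variables as strings, compound terms as `(functor, [args])`) are hypothetical.

```python
FRESH = ("$rho", [])   # hypothetical name for the distinguished constant


def chi(term):
    """Replace every variable in the term by the distinguished constant."""
    if isinstance(term, str):          # assumption: variables are strings
        return FRESH
    functor, args = term
    return (functor, [chi(a) for a in args])


def is_ground(term):
    """True when the term contains no variables."""
    if isinstance(term, str):
        return False
    return all(is_ground(a) for a in term[1])


t = ("cons", ["X", ("cons", ["Y", ("nil", [])])])
print(chi(t))
print(is_ground(t), is_ground(chi(t)))  # False True
```

Because the image of any term is ground, membership questions about possibly non-ground terms are turned into questions about ground terms over a signature with one extra constant.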

Corollary 7.5. For any $R_1, R_2 \in \mathit{Type}$, $[R_1]_\Delta \subseteq [R_2]_\Delta$ iff $((R_1))_\Delta \subseteq ((R_2))_\Delta$.

Proof

Both sufficiency and necessity are proved by contradiction. We first consider
sufficiency. Assume that $((R_1))_\Delta \subseteq ((R_2))_\Delta$ but $[R_1]_\Delta \not\subseteq [R_2]_\Delta$. Then there is a term $t$
such that $t \in [R_1]_\Delta$ and $t \notin [R_2]_\Delta$. By Theorem 7.4, we have that $\chi(t) \in ((R_1))_\Delta$. By
the assumption, $\chi(t) \in ((R_2))_\Delta$. By Theorem 7.4, $t \in [R_2]_\Delta$, which contradicts
$t \notin [R_2]_\Delta$.

We now prove necessity. Assume that $[R_1]_\Delta \subseteq [R_2]_\Delta$ but $((R_1))_\Delta \not\subseteq ((R_2))_\Delta$. Then
there is a term $t$ such that $t \in ((R_1))_\Delta$ and $t \notin ((R_2))_\Delta$. By Theorem 7.4, there is
a term $t'$ such that $\chi(t') = t$ and $t' \in [R_1]_\Delta$. By the assumption, $t' \in [R_2]_\Delta$. By
Theorem 7.4, $t \in ((R_2))_\Delta$, which contradicts $t \notin ((R_2))_\Delta$. □